MASS LABELS

This invention relates to useful compounds for labelling molecules of interest, particularly 
biomolecules such as peptides and proteins. Specifically this invention relates to labelling 
of analytes for detection by mass spectrometry and associated methods of analysing mass 
labelled analytes by mass spectrometry. 

Various methods of labelling molecules of interest are known in the art, including 
radioactive atoms, fluorescent dyes, luminescent reagents, electron capture reagents and 
light absorbing dyes. Each of these labelling systems has features which make it suitable
for certain applications and not others. For reasons of safety, interest in non-radioactive
labelling systems led to the widespread commercial development of fluorescent labelling
schemes, particularly for genetic analysis. Fluorescent labelling schemes permit the
labelling of a relatively small number of molecules simultaneously; typically 4 labels can
be used simultaneously and possibly up to eight. However, the costs of the detection
apparatus and the difficulties of analysing the resultant signals limit the number of labels 
that can be used simultaneously in a fluorescence detection scheme. 

More recently there has been development in the area of mass spectrometry as a method 
of detecting labels that are cleavably attached to their associated molecule of interest. In
many molecular biology applications one needs to be able to perform separations of the 
molecules of interest prior to analysis. These are generally liquid phase separations. Mass 
spectrometry in recent years has developed a number of interfaces for liquid phase 
separations which make mass spectrometry particularly effective as a detection system for 
these kinds of applications. Until recently Liquid Chromatography Mass Spectrometry 
was used to detect analyte ions or their fragment ions directly; however, for many
applications such as nucleic acid analysis, the structure of the analyte can be determined 
from indirect labelling. This is advantageous particularly with respect to the use of mass 
spectrometry because complex biomolecules such as DNA have complex mass spectra 
and are detected with relatively poor sensitivity. Indirect detection means that an
associated label molecule can be used to identify the original analyte, where the label is
designed for sensitive detection and a simple mass spectrum. Simple mass spectra mean
that multiple labels can be used to analyse multiple analytes simultaneously. 

PCT/GB98/00127 describes arrays of nucleic acid probes covalently attached to cleavable 
labels that are detectable by mass spectrometry which identify the sequence of the 
covalently linked nucleic acid probe. The labelled probes of this application have the 
structure Nu-L-M where Nu is a nucleic acid covalently linked to L, a cleavable linker, 
covalently linked to M, a mass label. Preferred cleavable linkers in this application cleave 
within the ion source of the mass spectrometer. Preferred mass labels are substituted
poly-aryl ethers. This application discloses a variety of ionisation methods and analysis by
quadrupole mass analysers, TOF analysers and magnetic sector instruments as specific 
methods of analysing mass labels by mass spectrometry. 

PCT/GB94/01675 discloses ligands, and specifically nucleic acids, cleavably linked to
mass tag molecules. Preferred cleavable linkers are photo-cleavable. This application
discloses Matrix Assisted Laser Desorption Ionisation (MALDI) Time of Flight (TOF) 
mass spectrometry as a specific method of analysing mass labels by mass spectrometry. 

PCT/US97/22639 discloses releasable non-volatile mass-label molecules. In preferred 
embodiments these labels comprise polymers, typically biopolymers which are cleavably 
attached to a reactive group or ligand, i.e. a probe. Preferred cleavable linkers appear to
be chemically or enzymatically cleavable. This application discloses MALDI TOF mass 
spectrometry as a specific method of analysing mass labels by mass spectrometry. 

PCT/US97/01070, PCT/US97/01046, and PCT/US97/01304 disclose ligands, and 
specifically nucleic acids, cleavably linked to mass tag molecules. Preferred cleavable 
linkers appear to be chemically or photo-cleavable. These applications disclose a variety
of ionisation methods and analysis by quadrupole mass analysers, TOF analysers and 
magnetic sector instruments as specific methods of analysing mass labels by mass 
spectrometry.

None of these prior art applications mention the use of tandem or serial mass analysis for
use in analysing mass labels. 

Gygi et al (Nature Biotechnology 17: 994-999, "Quantitative analysis of complex protein 
mixtures using isotope-coded affinity tags" 1999) disclose the use of 'isotope encoded
affinity tags' for the capture of peptides from proteins, to allow protein expression
analysis. In this article, the authors describe the use of a biotin linker, which is reactive to
thiols, for the capture of peptides with cysteine in them. A sample of protein from one
source is reacted with the biotin linker and cleaved with an endopeptidase. The 
biotinylated cysteine-containing peptides can then be isolated on avidinated beads for 
subsequent analysis by mass spectrometry. Two samples can be compared quantitatively 
by labelling one sample with the biotin linker and labelling the second sample with a 
deuterated form of the biotin linker. Each peptide in the samples is then represented as a
pair of peaks in the mass spectrum. Integration of the peaks in the mass spectrum
corresponding to each tag indicates the relative expression levels of the peptide linked to
the tags. 

This 'isotope encoding' method has a number of limitations. A first is the reliance on the
presence of thiols in a protein - many proteins do not have thiols while others have 
several. In a variation on this method, linkers may be designed to react with other side 
chains, such as amines. However, since many proteins contain more than one lysine 
residue, multiple peptides per protein would generally be isolated in this approach. It is 
likely that this would not reduce the complexity of the sample sufficiently for analysis by 
mass spectrometry. A sample that contains too many species is likely to suffer from 'ion 
suppression', in which certain species ionise preferentially over other species which would 
normally appear in the mass spectrum in a less complex sample. In general, capturing
proteins by their side chains is likely either to give too many peptides per protein or
to miss certain proteins altogether.

The second limitation of this approach is the method used to compare the expression 
levels of proteins from different samples. Labelling each sample with a different isotope
variant of the affinity tag results in an additional peak in the mass spectrum for each
peptide in each sample. This means that if two samples are analysed together there will 
be twice as many peaks in the spectrum. Similarly, if three samples are analysed together, 
the spectrum will be three times more complex than for one sample alone. It is clear that 
this approach will be limited, since the ever increasing numbers of peaks will increase the 
likelihood that two different peptides will have overlapping peaks in the mass spectrum. 

A further limitation, which is reported by the authors of the above paper, is the mobility 
change caused by the tags. The authors report that peptides labelled with the deuterated 
biotin tag elute slightly after the same peptide labelled with the undeuterated tag. 

The mass spectra generated for analyte material are very sensitive to contaminants. 
Essentially, any material introduced into the mass spectrometer that can ionise will appear 
in the mass spectrum. This means that for many analyses it is necessary to carefully purify 
the analyte before introducing it into the mass spectrometer. For the purposes of high 
throughput systems for indirect analysis of analytes through mass labels it would be 
desirable to avoid any unnecessary sample preparation steps. That is to say it would be 
desirable to be able to detect labels in a background of contaminating material and be 
certain that the peak that is detected does in fact correspond to a label. The prior art does 
not disclose methods or compositions that can improve the signal to noise ratio achievable 
in mass spectrometry based detection systems or that can provide confirmation that a mass 
peak in a spectrum was caused by the presence of a mass label.

For the purposes of detection of analytes after liquid chromatography or electrophoretic 
separations it is desirable that the labels used interfere minimally with the separation
process. If an array of such labels is used, it is desirable that the effect of each member
of the array on its associated analyte is the same as every other label. This conflicts to 
some extent with the intention of mass marking which is to generate arrays of labels that 
are resolvable in the mass spectrometer on the basis of their mass. It is disclosed in the 
prior art above that mass labels should preferably be resolved by 4 Daltons to prevent 
interference of isotope peaks from one label with those of another label. This means that
to generate 250 distinct mass labels would require labels spread over a range of about
1000 Daltons and probably more, since it is not trivial to generate large arrays of labels 
separated by exactly 4 Daltons. This range of mass will almost certainly result in mass 
labels that will have a distinct effect on any separation process that precedes detection by 
mass spectrometry. It also has implications for instrument design, in that as the mass 
range over which a mass spectrometer can detect ions increases, the cost of the instrument 
increases. 
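The range estimate above follows from simple arithmetic, sketched below. The function name is hypothetical, and the sketch assumes perfectly even spacing, which the text itself notes is hard to achieve in practice, so the result is a lower bound.

```python
def mass_range_needed(n_labels: int, spacing_da: int) -> int:
    """Span in Daltons between the lightest and heaviest label when
    n_labels are spaced evenly by spacing_da (an idealised assumption)."""
    if n_labels < 1:
        raise ValueError("need at least one label")
    return (n_labels - 1) * spacing_da


# 250 labels separated by 4 Daltons, the figures quoted in the text
print(mass_range_needed(250, 4))  # prints 996
```

The idealised span of 996 Daltons is consistent with the "about 1000 Daltons and probably more" stated above.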

It is thus an object of this invention to solve the problems associated with the above prior art, 
and to provide mass labels which can be detected in a background of contamination and 
whose identity as mass labels can be confirmed. Furthermore, it is an object of this
invention to provide arrays of labels which can be resolved in a compressed mass range so 
that the labels do not interfere as much with separation processes and which can be 
detected easily in a mass spectrometer that detects ions over a limited range of mass to 
charge ratios. 

It is also an object of this invention to provide methods of analysing biomolecules which 
exploit the labels of this invention to maximise throughput, signal to noise ratios and 
sensitivity of such assays, particularly for the analysis of peptides. 

In a first aspect the invention provides a set of two or more mass labels, each label in the 
set comprising a mass marker moiety attached via at least one amide bond to a mass 
normalisation moiety, wherein the aggregate mass of each label in the set may be the 
same or different and the mass of the mass marker moiety of each label in the set may be 
the same or different, and wherein in any group of labels within the set having a mass
marker moiety of a common mass each label has an aggregate mass different from all 
other labels in that group, and wherein in any group of labels within the set having a 
common aggregate mass each label has a mass marker moiety having a mass different 
from that of all other mass marker moieties in that group, such that all of the mass labels 
in the set are distinguishable from each other by mass spectrometry, and wherein the mass
marker moiety comprises an amino acid and the mass normalisation moiety comprises an
amino acid.
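The two mass conditions in the definition above can be read as a single requirement: no two labels in a set may share both the same aggregate mass and the same mass marker moiety mass. The following is a minimal sketch of that check only. It does not model the amide bond or amino acid requirements, and the type alias, function name and nominal masses are illustrative assumptions rather than values from this specification.

```python
from collections import Counter
from typing import Iterable, Tuple

# (aggregate_mass, marker_moiety_mass) in nominal Daltons; hypothetical values
Label = Tuple[int, int]


def is_distinguishable_set(labels: Iterable[Label]) -> bool:
    """True if the set holds two or more labels and no two labels share
    both aggregate mass and mass marker moiety mass."""
    labels = list(labels)
    if len(labels) < 2:
        return False
    return all(count == 1 for count in Counter(labels).values())


# Common aggregate mass, unique marker masses
print(is_distinguishable_set([(300, 100), (300, 101), (300, 102)]))  # True
# Common marker mass, unique aggregate masses
print(is_distinguishable_set([(300, 100), (304, 100)]))              # True
# Two labels identical in both masses cannot be told apart
print(is_distinguishable_set([(300, 100), (300, 100)]))              # False
```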

The term mass marker moiety used in the present context is intended to refer to a moiety 
that is to be detected by mass spectrometry, whilst the term mass normalisation moiety 
used in the present context is intended to refer to a moiety that is not necessarily to be 
detected by mass spectrometry, but is present to ensure that a mass label has a desired 
aggregate mass. The number of labels in the set is not especially limited, provided that 
the set comprises a plurality of labels. However, it is preferred if the set comprises two or 
more, three or more, four or more, or five or more labels. 

The present invention also provides an array of mass labels, comprising two or more sets 
of mass labels as defined above, wherein the aggregate mass of each of the mass labels in 
any one set is different from the aggregate mass of each of the mass labels in every other 
set in the array. The mass marker moiety and the mass normalisation moiety both 
comprise at least one amino acid. However, the moieties may comprise further groups, if
desired, such as more amino acid groups, and/or aryl ether groups. Thus the moieties may 
be modified amino acids, or may be peptides. The masses of the different sets in the array 
may be distinguished by adding further amino acid groups to either or both of the moieties 
as required. 
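The array condition, that no aggregate mass is shared between different sets, can be sketched as follows. The dictionary layout, set names and nominal masses are hypothetical and serve only to illustrate the condition.

```python
from typing import Dict, List, Tuple

# (aggregate_mass, marker_moiety_mass) in nominal Daltons; hypothetical values
Label = Tuple[int, int]


def sets_form_array(sets: Dict[str, List[Label]]) -> bool:
    """True if there are two or more sets and every aggregate mass
    occurs in one set only."""
    if len(sets) < 2:
        return False
    owner_of_mass: Dict[int, str] = {}
    for set_name, labels in sets.items():
        for aggregate_mass, _marker_mass in labels:
            # Record the first set seen with this aggregate mass
            owner = owner_of_mass.setdefault(aggregate_mass, set_name)
            if owner != set_name:
                return False
    return True


array = {
    "set 1": [(300, 100), (300, 101)],
    # Second set shifted in mass, e.g. by one further amino acid group
    "set 2": [(371, 100), (371, 101)],
}
print(sets_form_array(array))  # True
```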

Further provided by the invention is a method of analysis, which method comprises
detecting an analyte by identifying by mass spectrometry a mass label or a combination of 
mass labels unique to the analyte, wherein the mass label is a mass label from a set or an 
array of mass labels as defined above. 

In certain embodiments of this invention the mass tags may comprise reactive 
functionalities which facilitate the attachment of the mass tags to analyte molecules. The 
tags in this embodiment are preferably of the following form: 



amino acid 1 - amide bond - amino acid 2 - reactive functionality

where the mass marker moiety and the mass normalisation moiety may each be either
amino acid 1 or amino acid 2. 

In preferred embodiments of the invention, the tags in the array are all chemically
identical and the masses of the mass normalisation and mass marker moieties (e.g. amino
acid 1 and amino acid 2 above) are altered by isotope substitutions.

In further preferred embodiments of this invention, the tags may comprise a sensitivity 
enhancing group. The tags are preferably of the form:

sensitivity enhancing group - amino acid 1 - amide bond - amino acid 2 - reactive functionality

In this example the sensitivity enhancing group is usually attached to the mass marker 
moiety, since it is intended to increase the sensitivity of the detection of this moiety in the 
mass spectrometer. The reactive functionality is shown as being present and attached to a 
different moiety than the sensitivity enhancing group. However, the tags need not be 
limited in this way and in some cases comprise the sensitivity enhancing group without 
the reactive functionality. In other embodiments the sensitivity enhancing group may be 
attached to the same moiety as the reactive functionality. 

In certain embodiments of the invention the mass tags comprise an affinity capture
ligand. Preferably, the affinity capture ligand is biotin. The affinity capture ligand
allows labelled analytes to be separated from unlabelled analytes by capturing them, e.g. 
on an avidinated solid phase. 



In a further aspect the invention provides a method of analysing a biomolecule or a 
mixture of biomolecules. This method preferably comprises the steps of:

1. Reacting the biomolecule or mixture of biomolecules with a mass marker
according to this invention; 

2. Optionally separating the labelled biomolecule electrophoretically or 
chromatographically; 

3. Ionising the labelled biomolecule; 

4. Selecting ions of a predetermined mass to charge ratio corresponding to the mass 
to charge ratio of the preferred ions of the labelled biomolecule in a mass analyser; 

5. Inducing dissociation of these selected ions by collision; 

6. Detecting the collision products to identify collision product ions that are 
indicative of the mass labels. 
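The final step, identifying collision product ions indicative of the mass labels, can be illustrated with a minimal sketch that sums fragment intensity near each expected mass marker m/z and reports each marker's share. The function name, the tolerance and all m/z and intensity values are hypothetical. The sketch assumes the expected marker ions are separated by more than twice the tolerance, so that no peak is counted for two markers.

```python
from typing import Dict, Iterable, Tuple

Peak = Tuple[float, float]  # (m/z, intensity)


def marker_fractions(fragment_peaks: Iterable[Peak],
                     marker_mz: Dict[str, float],
                     tol: float = 0.5) -> Dict[str, float]:
    """Sum fragment intensity within tol of each expected marker m/z and
    return each marker's fraction of the total marker signal."""
    totals = {name: 0.0 for name in marker_mz}
    for mz, intensity in fragment_peaks:
        for name, expected in marker_mz.items():
            if abs(mz - expected) <= tol:
                totals[name] += intensity
    grand_total = sum(totals.values())
    if grand_total == 0:
        return totals  # no marker ions detected
    return {name: value / grand_total for name, value in totals.items()}


# Hypothetical collision product spectrum of one selected precursor ion
spectrum = [(100.0, 700.0), (103.0, 300.0), (250.2, 50.0)]
print(marker_fractions(spectrum, {"A": 100.0, "B": 103.0}))
```

In this invented spectrum the peak at m/z 250.2 matches neither marker and is ignored, so markers A and B account for 0.7 and 0.3 of the marker signal respectively.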

In this embodiment, where the mass tags comprise an affinity tag, the affinity tagged 
biomolecules may be captured by a counter-ligand to allow labelled biomolecules to be 
separated from unlabelled biomolecules. This step preferably takes place prior to the 
optional second step above. 

In certain embodiments the step of selecting the ions of a predetermined mass to charge
ratio is performed in the first mass analyser of a serial instrument. The selected ions are
then channelled into a separate collision cell where they are collided with a gas or a solid 
surface according to the fourth step of the first aspect of the invention. The collision 
products are then channelled into a further mass analyser of a serial instrument to detect 
collision products according to the fifth step of the first aspect of this invention. Typical 
serial instruments include triple quadrupole mass spectrometers, tandem sector 
instruments and quadrupole time of flight mass spectrometers. 

In other embodiments, the step of selecting the ions of a predetermined mass to charge 
ratio, the step of colliding the selected ions with a gas and the step of detecting the 
collision products are performed in the same zone of the mass spectrometer. This may be
effected in ion trap mass analysers and Fourier Transform Ion Cyclotron Resonance mass 
spectrometers, for example.

In another aspect, this invention provides sets or arrays of mass labelled molecules of the
form: 

analyte - linker - label 

where label is a mass marker from a set or array according to this invention, the linker is a 
linker as described below and analyte may be any analyte of interest such as a 
biomolecule. One preferred aspect of this embodiment is where the analytes (one, more 
than one or even all the analytes) in the set or array are standard analytes with a known 
mass or with predetermined chromatographic properties. Such standards can be 
employed in the methods of the present invention for comparison with unknown analytes, 
for example when analysing the results of a chromatographic separation step. 

This invention describes mass markers that may be readily produced in a peptide 
synthesiser. Indeed, the compounds used in this invention comprise peptides and
modified peptides. Peptide synthesis provides chemical diversity allowing for a wide 
range of markers with chosen properties to be produced in an automated fashion. 

The term 'MS/MS' in the context of mass spectrometers refers to mass spectrometers
capable of selecting ions, subjecting selected ions to Collision Induced Dissociation (CID)
and subjecting the fragment ions to further analysis. 

The term 'serial instrument' refers to mass spectrometers capable of MS/MS in which 
mass analysers are organised in series and each step of the MS/MS process is performed 
one after the other in linked mass analysers. Typical serial instruments include triple 
quadrupole mass spectrometers, tandem sector instruments and quadrupole time of flight 
mass spectrometers. 

The invention will now be described in further detail by way of example only, with 
reference to the accompanying drawings, in which:

Figure 1 shows a set of 3 mass tags derived from lysine;

Figure 2 shows a set of 5 mass tags derived from alanine; 

Figure 3 shows a set of 5 mass tags derived from alanine and tyrosine; 

Figure 4 shows a set of 4 mass tags derived from fluorinated forms of phenylglycine; 

Figure 5 shows a set of 4 mass tags derived from fluorinated forms of phenylglycine and 
phenylalanine; 

Figure 6a shows a set of 2 affinity ligand mass tags derived from methionine with a 
hydrazide functionality for labelling carbohydrates; 

Figure 6b shows a set of 2 affinity ligand mass tags derived from methionine with a 
boronic acid functionality for labelling carbohydrates; 

Figure 7 shows a set of 2 affinity ligand mass tags derived from methionine with a thiol 
functionality for labelling dehydroalanine and methyldehydroalanine residues;

Figure 8 shows a set of 2 affinity ligand mass tags derived from methionine with a 
maleimide functionality for labelling free thiols;

Figure 9a shows a synthetic pathway for the preparation of an FMOC protected, 
deuterated methionine residue and Figure 9b shows a synthetic pathway for the
preparation of a reactive linker that can act as a sensitivity enhancer;

Figure 10 shows a pair of example peptides derived from different isotopic forms of 
methionine synthesised to demonstrate the features of this invention;

Figure 11 shows an electrospray mass spectrum of a mixture of the two peptides shown in
Figure 10; 

Figure 12 shows an electrospray spectrum of the fragmentation of each of the two
peptides shown in Figure 10; 

Figure 13 shows a hypothetical fragmentation mechanism that is likely to account for the 
spectra shown in Figures 12 and 14; 

Figure 14 shows an electrospray spectrum of the fragmentation of a 70:30 mixture of the
two peptides shown in Figure 10;

Figure 15 shows a graph displaying the expected ratios of peptides A and B (Figure 10) 
against observed ratios of peptides A and B found in a series of ESI-MS/MS analyses of 
mixtures of A and B;

Figures 16a-16c depict proposed fragmentation mechanisms; 

Figures 17a-17d illustrate tags which exploit enhancing cleavage at the cleavable amide 
bond; 

Figures 18a and 18b show the structures of two versions of the TMT markers;

Figures 19a and 19b show typical CID spectra for a peptide labelled with the first
generation TMT at collision energies of 40V (Figure 19a) and 70V (Figure 19b); 

Figures 20a, 20b and 20c show MS and MS/MS spectra for triply charged ions of
peptide 2 (see Table 7) labelled with the first and second generation TMTs;

Figure 21 shows a typical CID spectrum for a peptide (peptide 2 in Table 7) labelled with a
second generation TMT;

Figure 22 shows that the charge state of the TMT tagged peptide does not affect the
appearance of the tag fragments in the CID spectra of the labelled peptides;

Figure 23 shows peptide mixtures with the expected and measured abundance ratios for 
both the first and second generation tags; 

Figure 24 shows the co-elution of each peptide pair, peptides A and B for each peptide 
from Table 7; 

Figure 25 shows a dynamic range study of TMT peptide pair 3A/3B, which is present
in a ratio of 40:60 and has been analysed at dilutions in the range from 100 fmol to
100 pmol; and

Figures 26a, 26b and 26c show the results of a spiking experiment in which peptide pair
3A and 3B (500 fmol in total, in a ratio of 40:60 respectively) bearing a second generation
TMT was mixed with a tryptic digest of Bovine Serum Albumin (2 pmol).

Figures 1 to 5 illustrate a number of important features of the tags of this invention. The 
tags in all of figures 1 to 5 are shown linked to a 'reactive functionality', which could be a 
linker to an N-hydroxysuccinimide ester for example or any of a number of other
possibilities, some of which are discussed below. Figures 1, 2 and 4 show that a number
of tags can be generated by combining different mass modified forms of the same amino
acid into a series of dipeptides. Figures 3 and 5 show sets of tags, which are created by 
combining different amino acids in heterodimers. Figures 1 to 3 illustrate tags which all
have the same total mass and which are chemically identical; these tags differ in the
distribution of isotopes in the molecules. Figures 4 and 5 illustrate tags which all have the
same total mass but which are not chemically identical; these tags differ in the
distribution of fluorine substituents in the tags.

Figure 1 will now be discussed in more detail. Figure 1 shows 3 homodimers of lysine.
The lysine has been blocked at the epsilon amino groups with methylsulphonyl chloride. 
The sulphonamide linkage is more resistant to fragmentation than a conventional amide 
linkage, so that the capping group will not be lost when the tag is fragmented in a mass 
spectrometer using collision induced dissociation at energies sufficient to cleave the 
conventional backbone amide bond between the pair of modified lysine residues. The 
capping group is used to inhibit protonation at the epsilon position during ionisation of 
the tags in a mass spectrometer. The capped lysine can be prepared prior to synthesis of 
the mass tags. The epsilon amino group can be selectively modified by coupling the 
amino acid with methylsulphonyl chloride in the presence of copper ions, for example. 
Amine and acid functionalities at the alpha position can form chelates with various 
divalent cations, making the alpha amino group unreactive. The alpha-amino group of the
dipeptide has been converted to a guanidino-group to promote protonation at this position 
in the tag during ionisation in a mass spectrometer and to differentiate the mass of the 
fragmentation product from the second lysine residue and natural lysine residues in
protein. The guanidination of the alpha-position can be performed as the last step of a 
conventional peptide synthesis before deprotection of the peptide and cleavage from the 
resin (Z. Tian and R.W. Roeske, Int. J. Peptide Protein Res. 37: 425-429, "Guanidination
of a peptide side chain amino group on a solid support", 1991). Different deuterated 
forms of lysine would be used to prepare the three different tags. The total mass of each 
of the three tags is the same but the N-terminal lysine in each tag differs from the other 
two by at least four Daltons. This mass difference is usually sufficient to prevent natural 
isotope peaks from fragmented portions of each tag from overlapping in the mass 
spectrum with the isotope peaks of the fragmented portions of other tags. 

Figure 2 will now be discussed in more detail. Figure 2 shows 5 homodimers of alanine. 
Different isotopically substituted forms of alanine would be used to prepare the five 
different tags. The total mass of each of the five tags is the same but the N-terminal 
alanine in each tag differs from the other four by at least one Dalton. The alpha amino 
group of the dipeptide tag has been methylated to differentiate the fragmentation product 
of this amino acid from the fragmentation product of the second alanine residue and the
natural alanine residues in the protein and to promote protonation at this position in the
tag during ionisation in a mass spectrometer. 

Figure 3 will now be discussed in more detail. Figure 3 shows 5 heterodimers of alanine 
and tyrosine. Different isotopically substituted forms of alanine and tyrosine would be
used to prepare the five different tags. The total mass of each of the five tags is the same
but the N-terminal alanine in each tag differs from the other four by at least one Dalton.
The alpha amino group of the dipeptide tag has been methylated to differentiate the 
fragmentation product of this amino acid from the fragmentation products of natural 
alanine residues in the protein and to promote protonation at this position in the tag during 
ionisation in a mass spectrometer. 

Figure 4 will now be discussed in more detail. Figure 4 shows 4 dimers of phenylglycine. 
Different fluorine substituted forms of phenylglycine would be used to prepare the 4 
different tags. The total mass of each of the 4 tags is the same but the N-terminal 
phenylglycine in each tag differs from the other 3 tags by the mass of at least one fluorine 
atom. The alpha amino group of the dipeptide tag has been methylated to differentiate the 
fragmentation product of this amino acid from the fragmentation product of the second 
phenylglycine residue and to promote protonation at this position in the tag during 
ionisation in a mass spectrometer. 

Figure 5 will now be discussed in more detail. Figure 5 shows 4 dimers comprising 
phenylglycine and phenylalanine. Different fluorine substituted forms of phenylglycine 
and phenylalanine would be used to prepare the 4 different tags. The total mass of each 
of the 4 tags is the same but the N-terminal amino acid in each tag differs from the other 3
tags by the mass of at least one fluorine atom. The alpha amino group of the dipeptide tag 
has been methylated, although this serves only to protect the amino group from side
reactions and to increase protonation; it is not necessary to differentiate the first amino
acid, as its fragmentation product without methylation would be different from the second
amino acid residue of the tag peptide. The alpha amino group could be modified to
promote protonation at this position in the tag during ionisation in a mass spectrometer by
methylation or guanidination if this is desirable.

The present invention will now be described in more detail. In one preferred 
embodiment, the present invention provides a set of mass labels as defined above, in 
which each label in the set has a mass marker moiety having a common mass and each 
label in the set has a unique aggregate mass. 

In an alternative, more preferred embodiment, each label in the set has a common 
aggregate mass and each label in the set has a mass marker moiety of a unique mass. 

The set of labels need not be limited to the two preferred embodiments described above, 
and may for example comprise labels of both types, provided that all labels are 
distinguishable by mass spectrometry, as outlined above. 

It is preferred that, in a set of labels of the second type, each mass marker moiety in the 
set has a common basic structure and each mass normalisation moiety in the set has a 
common basic structure, and each mass label in the set comprises one or more mass 
adjuster moieties, the mass adjuster moieties being attached to or situated within the basic 
structure of the mass marker moiety and/or the basic structure of the mass normalisation 
moiety. In this embodiment, every mass marker moiety in the set comprises a different 
number of mass adjuster moieties and every mass label in the set has the same number of 
mass adjuster moieties. 

Throughout this description, by common basic structure, it is meant that two or more 
moieties share a structure which has substantially the same structural skeleton, backbone 
or core. This skeleton or backbone may, for example, comprise one or more amino
acids. Preferably the skeleton comprises a number of amino acids linked by amide bonds.
However, other units such as aryl ether units may also be present. The skeleton or
backbone may comprise substituents pendent from it, or atomic or isotopic replacements 
within it, without changing the common basic structure.

Typically, a set of mass labels of the second type referred to above comprises mass labels
with the formula: 

M(A)y-L-X(A)z

wherein M is the mass normalisation moiety, X is the mass marker moiety, A is a mass 
adjuster moiety, L is the cleavable linker comprising the amide bond, y and z are integers 
of 0 or greater, and y+z is an integer of 1 or greater. Preferably M is a fragmentation
resistant group, L is a linker that is susceptible to fragmentation on collision with another
molecule or atom and X is preferably a pre-ionised, fragmentation resistant group. The
sum of the masses of M and X is the same for all members of the set. Preferably M and X
have the same basic structure or core structure, this structure being modified by the mass
adjuster moieties. The mass adjuster moiety ensures that the sum of the masses of M and
X is the same for all mass labels in a set, but ensures that each X has a distinct (unique)
mass. 
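
The mass bookkeeping implied by the formula M(A)y-L-X(A)z can be illustrated with a minimal sketch. It assumes that a fixed total number of identical mass adjusters is shared between M and X, and it uses made-up nominal masses (100 for each basic structure, 1 for each adjuster); the names `MassLabel` and `build_label_set` are hypothetical and are not taken from this description.

```python
from typing import List, NamedTuple


class MassLabel(NamedTuple):
    y: int                # number of adjusters on the mass normalisation moiety M
    z: int                # number of adjusters on the mass marker moiety X
    normaliser_mass: int  # nominal mass of M(A)y
    marker_mass: int      # nominal mass of X(A)z

    @property
    def total_mass(self) -> int:
        return self.normaliser_mass + self.marker_mass


def build_label_set(n_adjusters: int, m_base: int, x_base: int,
                    adjuster_mass: int) -> List[MassLabel]:
    """Enumerate every split of n_adjusters between M (y) and X (z)."""
    if n_adjusters < 1:
        raise ValueError("y + z must be 1 or greater")
    labels = []
    for z in range(n_adjusters + 1):
        y = n_adjusters - z
        labels.append(MassLabel(y, z,
                                m_base + y * adjuster_mass,
                                x_base + z * adjuster_mass))
    return labels


# Illustrative values only: base masses and adjuster mass are assumptions.
labels = build_label_set(n_adjusters=3, m_base=100, x_base=100, adjuster_mass=1)
for label in labels:
    print(label.y, label.z, label.marker_mass, label.total_mass)
```

In this sketch every label has the same total mass, while each marker moiety has a different mass, which is the property the set relies on when the linker L is cleaved.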

The present invention also encompasses arrays of a plurality of sets of mass labels. The 
arrays of mass labels of the present invention are not particularly limited, provided that 
they contain a plurality of sets of mass labels according to the present invention. It is 
preferred that the arrays comprise two or more, three or more, four or more, or five or 
more sets of mass labels. Preferably each mass label in the array has either of the 
following structures: 

(S)x-M(A)y-L-X(A)z 
M(A)y-(S)x-L-X(A)z 

wherein S is a mass series modifying group, M is the mass normalisation moiety, X is the 
mass marker moiety, A is the mass adjuster moiety, L is the cleavable linker comprising 
the amide bond, x is an integer of 0 or greater, y and z are integers of 0 or greater, and y+z 
is an integer of 1 or greater. The mass series modifying group separates the masses of the 
sets from each other. This group may be any type of group, but is preferably an amino 
acid, or aryl ether group. Sets may be separated in mass by comprising a different number 
of amino acids in their moieties than other tags from different sets. 

Linker Groups 

In the discussion above and below reference is made to linker groups which may be used 
to connect molecules of interest to the mass label compounds of this invention. A variety 
of linkers is known in the art which may be introduced between the mass labels of this 
invention and their covalently attached analyte. Some of these linkers may be cleavable. 
Oligo- or poly-ethylene glycols or their derivatives may be used as linkers, such as those 
disclosed in Maskos, U. & Southern, E.M., Nucleic Acids Research 20: 1679-1684, 1992. 
Succinic acid based linkers are also widely used, although these are less preferred for 
applications involving the labelling of oligonucleotides as they are generally base labile 
and are thus incompatible with the base mediated de-protection steps used in a number of 
oligonucleotide synthesisers. 

Propargylic alcohol is a bifunctional linker that provides a linkage that is stable under the 
conditions of oligonucleotide synthesis and is a preferred linker for use with this 
invention in relation to oligonucleotide applications. Similarly 6-aminohexanol is a 
useful bifunctional reagent to link appropriately functionalised molecules and is also a 
preferred linker. 

A variety of known cleavable linker groups may be used in conjunction with the 
compounds of this invention, such as photocleavable linkers. Ortho-nitrobenzyl groups 
are known as photocleavable linkers, particularly 2-nitrobenzyl esters and 
2-nitrobenzylamines, which cleave at the benzylamine bond. For a review on cleavable 
linkers see Lloyd-Williams et al., Tetrahedron 49, 11065-11133, 1993, which covers a 
variety of photocleavable and chemically cleavable linkers. 



WO 00/02895 discloses vinyl sulphone compounds as cleavable linkers, which are 
also applicable for use with this invention, particularly in applications involving the 
labelling of polypeptides, peptides and amino acids. The content of this application is 
incorporated by reference. 

WO 00/02895 discloses the use of silicon compounds as linkers that are cleavable by base 
in the gas phase. These linkers are also applicable for use with this invention, particularly 
in applications involving the labelling of oligonucleotides. The content of this application 
is incorporated by reference. 

It has been mentioned above that the mass labels of the present invention may comprise 
reactive functionalities, Re, to help attach them to analytes. In preferred embodiments of 
the present invention, Re is a reactive functionality or group which allows the mass label 
to be reacted covalently to an appropriate functional group in an analyte molecule, such 
as, but not limited to, a nucleotide, oligonucleotide, polynucleotide, amino acid, peptide or 
polypeptide. Re may be attached to the mass labels via a linker which may or may not be 
cleavable. A variety of reactive functionalities may be introduced into the mass labels of 
this invention. 

Table 1 below lists some reactive functionalities that may be reacted with nucleophilic 
functionalities which are found in biomolecules to generate a covalent linkage between 
the two entities. For applications involving synthetic oligonucleotides, primary amines or 
thiols are often introduced at the termini of the molecules to permit labelling. Any of the 
functionalities listed below could be introduced into the compounds of this invention to 
permit the mass markers to be attached to a molecule of interest. A reactive functionality 
can be used to introduce a further linker group with a further reactive functionality if that 
is desired. Table 1 is not intended to be exhaustive and the present invention is not 
limited to the use of only the listed functionalities. 

Table 1 

Nucleophilic Functionality | Reactive Functionality          | Resultant Linking Group
---------------------------+---------------------------------+---------------------------------------
-SH                        | -SO2-CH=CR2                     | -S-CR2-CH2-SO2-
-NH2                       | -SO2-CH=CR2                     | -N(CR2-CH2-SO2-)2 or -NH-CR2-CH2-SO2-
-NH2                       | [structure drawing not legible] | -CO-NH-
-NH2                       | [structure drawing not legible] | [not legible]
-NH2                       | -NCO                            | -NH-CO-NH-
-NH2                       | -NCS                            | -NH-CS-NH-
-NH2                       | -CHO                            | -CH2-NH-
-NH2                       | -SO2Cl                          | -SO2-NH-
-NH2                       | -CH=CH-                         | -NH-CH2-CH2-
-OH                        | -OP(NCH(CH3)2)2                 | -OP(=O)(O)O-


It should be noted that in applications involving labelling oligonucleotides with the mass 
markers of this invention, some of the reactive functionalities above or their resultant 
linking groups might have to be protected prior to introduction into an oligonucleotide 
synthesiser. Preferably, unprotected ester, thioether, thioester, amine and amide 
bonds are to be avoided, as these are not usually stable in an oligonucleotide synthesiser. 
A wide variety of protective groups is known in the art which can be used to protect 
linkages from unwanted side reactions. 

In the discussion below reference is made to "charge carrying functionalities" and 
solubilising groups. These groups may be introduced into the mass labels such as in the 
mass markers of the invention to promote ionisation and solubility. The choice of 
markers is dependent on whether positive or negative ion detection is to be used. Table 2 
below lists some functionalities that may be introduced into mass markers to promote 
either positive or negative ionisation. The table is not intended as an exhaustive list, and 
the present invention is not limited to the use of only the listed functionalities. 



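Table 2 

Positive Ion Mode               | Negative Ion Mode
--------------------------------+------------------
-NH2                            | -SO3-
-NR2                            | -PO4-
-NR3+                           | -PO3-
[structure drawing not legible] | -CO2-
[structure drawing not legible] |
-SR2+                           |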





WO 00/02893 discloses the use of metal-ion binding moieties such as crown-ethers or 
porphyrins for the purpose of improving the ionisation of mass markers. These moieties 
are also applicable for use with the mass markers of this invention. 



The components of the mass markers of this invention are preferably fragmentation 
resistant so that the site of fragmentation of the markers can be controlled by the 
introduction of a linkage that is easily broken by Collision Induced Dissociation (CID). 
Aryl ethers are an example of a class of fragmentation resistant compounds that may be 
used in this invention. These compounds are also chemically inert and thermally stable. 
WO 99/32501 discusses the use of poly-ethers in mass spectrometry in greater detail and 
the content of this application is incorporated by reference. 

In the past, the general method for the synthesis of aryl ethers was based on the Ullmann 
coupling of arylbromides with phenols in the presence of copper powder at about 200°C 
(representative reference: H. Stetter, G. Duve, Chemische Berichte 87 (1954) 1699). 
Milder methods for the synthesis of aryl ethers have been developed using a different 
metal catalyst but the reaction temperature is still between 100 and 120°C (M. Iyoda, M. 
Sakaitani, H. Otsuka, M. Oda, Tetrahedron Letters 26 (1985) 477). This is a preferred 
route for the production of poly-ether mass labels. See synthesis of FT77 given in the 
examples below. A recently published method provides a most preferred route for the 
generation of poly-ether mass labels as it is carried out under much milder conditions than 
the earlier methods (D. A. Evans, J. L. Katz, T. R. West, Tetrahedron Lett. 39 (1998) 
2937). 

The present invention also provides a set of two or more probes, each probe in the set 
being different and being attached to a unique mass label or a unique combination of mass 
labels, from a set or an array of mass labels as defined above. 

Further provided is an array of probes comprising two or more sets of probes, wherein 
each probe in any one set is attached to a unique mass label, or a unique combination of 
mass labels, from a set of mass labels as defined above, and wherein the probes in any one 
set are attached to mass labels from the same set of mass labels, and each set of probes is 
attached to mass labels from unique sets of mass labels from an array of mass labels as 
defined above. 

In one embodiment, each probe is preferably attached to a unique combination of mass 
labels, each combination being distinguished by the presence or absence of each mass 
label in the set of mass labels and/or the quantity of each mass label attached to the probe. 
This is termed the "mixing mode" of the present invention, since the probes may be 
attached to a mixture of mass labels. 
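
As a rough illustration of the coding capacity of the mixing mode, the sketch below enumerates the presence/absence combinations available from a small set of labels. It deliberately ignores the second variable mentioned above (the quantity of each label), and the label names `L1` to `L4` and the function `mixing_mode_codes` are hypothetical.

```python
from itertools import combinations
from typing import List, Sequence, Tuple


def mixing_mode_codes(labels: Sequence[str]) -> List[Tuple[str, ...]]:
    """Return every non-empty combination of labels (presence/absence only)."""
    codes: List[Tuple[str, ...]] = []
    for size in range(1, len(labels) + 1):
        codes.extend(combinations(labels, size))
    return codes


# Four hypothetical labels give 2**4 - 1 distinguishable probe codes.
codes = mixing_mode_codes(["L1", "L2", "L3", "L4"])
print(len(codes))
```

With n labels this gives 2^n - 1 distinct non-empty combinations, so the number of distinguishable probes grows much faster than the number of labels.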

In the above aspects, the nature of the probe is not particularly limited. However, 
preferably each probe comprises a biomolecule. Any biomolecule can be employed, but 
the biomolecule is preferably selected from a DNA, an RNA, an oligonucleotide, a 
nucleic acid base, a peptide, a polypeptide, a protein and an amino acid. 

In one preferred embodiment, this invention provides sets and arrays of mass labelled 
analytes, such as nucleotides, oligonucleotides and polynucleotides, of the form: 

analyte- linker -label 

wherein the linker is a linker as defined above, and the label is a mass label from any of the 
sets and arrays defined above. 

In the above aspect, the nature of the analyte is not particularly limited. However, 
preferably each analyte comprises a biomolecule. Any biomolecule can be employed, but 
the biomolecule is preferably selected from a DNA, an RNA, an oligonucleotide, a 
nucleic acid base, a peptide, a polypeptide, a protein and an amino acid. 

In one embodiment, each analyte is preferably attached to a unique combination of mass 
labels, each combination being distinguished by the presence or absence of each mass 
label in the set of mass labels and/or the quantity of each mass label attached to the probe. 
As mentioned above, this is termed the "mixing mode" of the present invention, since the 
probes may be attached to a mixture of mass labels. 

As mentioned above, the present invention provides a method of analysis, which method 
comprises detecting an analyte by identifying by mass spectrometry a mass label or a 
combination of mass labels unique to the analyte, wherein the mass label is a mass label 
from a set or an array of mass labels as defined above. The type of method is not 
particularly limited, provided that the method benefits from the use of the mass labels of 
the present invention to identify an analyte. The method may be, for example, a method 
of sequencing nucleic acid or a method of profiling the expression of one or more genes 
by detecting quantities of protein in a sample. The method is especially advantageous, 
since it can be used to readily analyse a plurality of analytes simultaneously. However, 
the method also has advantages for analysing single analytes individually, since using the 
present mass labels, mass spectra which are cleaner than conventional spectra are 
produced, making the method accurate and sensitive. 

In a further preferred embodiment, the present invention provides a method, which method 
comprises: 

(a) contacting one or more analytes with a set of probes, or an array of probes, 
each probe in the set or array being specific to at least one analyte, wherein the probes are 
as defined above, 

(b) identifying an analyte, by detecting the probe specific to that analyte. 

In this embodiment it is preferred that the mass label is cleaved from the probe prior to 
detecting the mass label by mass spectrometry. 

The nature of the methods of this particular embodiment is not especially limited. 
However, it is preferred that the method comprises contacting one or more nucleic acids 
with a set of hybridisation probes. The set of hybridisation probes typically comprises a 
set of up to 256 4-mers, each probe in the set having a different combination of nucleic 
acid bases. This method may be suitable for identifying the presence of target nucleic 
acids, or alternatively can be used in a stepwise method of primer extension sequencing of 
one or more nucleic acid templates. 
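
The figure of 256 follows from the four nucleic acid bases taken four at a time (4^4 = 256). A minimal sketch that enumerates such a probe set is given below; the function name `all_kmers` and the use of the DNA alphabet ACGT are illustrative assumptions.

```python
from itertools import product
from typing import List


def all_kmers(k: int, alphabet: str = "ACGT") -> List[str]:
    """Return every sequence of length k over the given base alphabet."""
    return ["".join(bases) for bases in product(alphabet, repeat=k)]


four_mers = all_kmers(4)
print(len(four_mers), four_mers[0], four_mers[-1])
```

Each sequence in such a list would need its own mass label, or its own combination of mass labels, to be identified uniquely.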

The mass labels of the present invention are particularly suitable for use in methods of 
2-dimensional analysis, primarily due to the large number of labels that can be 
simultaneously distinguished. The labels may thus be used in a method of 2-dimensional 
gel electrophoresis, or in a method of 2-dimensional mass spectrometry. 

Peptide Synthesis 

The synthesis of many examples of the peptide mass tags of this invention will be possible 
using conventional peptide synthesis methods and commercially available reagents. 
Modified amino acids that are not commercially available are also contemplated for the 
synthesis of further peptide mass tags. 

Modern peptide synthesis is typically carried out on solid phase supports in automated 
synthesiser instruments, which deliver all the necessary reagents for each step of a peptide 
synthesis to the solid support and remove spent reagents and unreacted excess reagents at 
the end of each step in the cycle. Solid phase peptide synthesis is, however, often 
performed manually, particularly when specialist reagents are being tested for the first 
time. In essence peptide synthesis involves the addition of N-protected amino acids to the 
solid support. The peptide is normally synthesised with the C-terminal carboxyl group of 
the peptide attached to the support, and the sequence of the peptide is built from the 
C-terminal amino acid to the N-terminal amino acid. The C-terminal amino acid is coupled 
to the support by a cleavable linkage. The N-protected alpha amino group of each amino 
acid is deprotected to allow coupling of the carboxyl group of the next amino acid to the 
growing peptide on the solid support. For most purposes, peptide synthesis is performed 
by one of two different synthetic procedures, which are distinguished by the conditions 
needed to remove the N-protecting group. The tert-butyloxycarbonyl (t-BOC) group is 
cleaved by mildly acidic conditions, e.g. trifluoroacetic acid in dichloromethane, while the 
fluorenylmethoxycarbonyl (FMOC) group is cleaved by mildly basic conditions, e.g. 20% 
piperidine in dimethylformamide. Reactive side chains in amino acids also need 
protection during cycles of amide bond formation. These side chains include the epsilon 
amino group of lysine, the guanidino side-chain of arginine, the thiol functionality of 
cysteine, the hydroxyl functionalities of serine, threonine and tyrosine, the indole ring of 
tryptophan and the imidazole ring of histidine. The choice of protective groups used for 
side-chain protection is determined by the cleavage conditions of the alpha-amino 
protection groups, as the side-chain protection groups must be resistant to the 
deprotection conditions used to remove the alpha-amino protection groups. A first 
protective group is said to be 'orthogonal' to a second protective group if the first 
protective group is resistant to deprotection under the conditions used for the deprotection 
of the second protective group and if the deprotection conditions of the first protecting 
group do not cause deprotection of the second protecting group. 
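
The definition of orthogonality given above can be expressed as a simple two-way check. The sketch below is a deliberately simplified model: each protective group is reduced to one deprotection condition and a set of conditions to which it is labile. The FMOC and t-BOC entries follow the conditions stated above; the tert-butyl entry, the condition labels and the function name `is_orthogonal` are illustrative assumptions.

```python
from typing import Dict, Set

# Condition used to remove each group (simplified to a single label).
DEPROTECTION: Dict[str, str] = {
    "FMOC": "mild base",        # e.g. piperidine in dimethylformamide
    "t-BOC": "mild acid",       # e.g. trifluoroacetic acid in dichloromethane
    "tert-butyl": "mild acid",  # assumption made for this illustration
}

# Conditions under which each group is lost (simplified).
LABILE_TO: Dict[str, Set[str]] = {
    "FMOC": {"mild base"},
    "t-BOC": {"mild acid"},
    "tert-butyl": {"mild acid"},  # assumption made for this illustration
}


def is_orthogonal(first: str, second: str) -> bool:
    """True if neither group is removed by the other's deprotection conditions."""
    first_survives = DEPROTECTION[second] not in LABILE_TO[first]
    second_survives = DEPROTECTION[first] not in LABILE_TO[second]
    return first_survives and second_survives


print(is_orthogonal("FMOC", "t-BOC"))
print(is_orthogonal("t-BOC", "tert-butyl"))
```

Under these assumptions an acid-labile side-chain group is orthogonal to FMOC alpha-amino protection but not to t-BOC alpha-amino protection, which is why the two procedures call for different side-chain protective groups.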

Examples of side-chain protection groups compatible with FMOC syntheses are shown in 
Table 3. 

Table 3 

Side Chain                                          | Protective Group
----------------------------------------------------+----------------------------------------------------
Epsilon amino group of lysine                       | t-BOC group
Guanidino functionality of arginine                 | Nitro group or 2,2,5,7,8-pentamethylchroman-6-
                                                    | sulphonyl group
Imidazole ring of histidine                         | τ-Trityl group, π-benzyloxymethyl (Bom) group
Hydroxyl functionalities of serine, threonine and   | Tert-butyl group
tyrosine                                            |
Indole ring of tryptophan                           | t-BOC
Thiol functionality of cysteine                     | Trityl or benzyl group
Amide functionalities of glutamine and asparagine   | Not usually necessary, but a trityl group can be
                                                    | used, for example
Carboxylic acid functionalities of glutamic acid    | Tert-butyl group
and aspartic acid                                   |
Thioether of methionine                             | Sometimes protected as sulphoxide



Other side-chain protective groups that are orthogonal to FMOC protection will be known 
to one of ordinary skill in the art and may be applied with this invention (see for example 
Fields G.B. & Noble R.L., Int J Pept Protein Res 35(3): 161-214, "Solid phase peptide 
synthesis utilizing 9-fluorenylmethoxycarbonyl amino acids." 1990). 



Protection groups for reactive side-chain functionalities compatible with t-BOC synthesis 
are shown below in Table 4. 

Table 4 

Side Chain                                          | Protective Group
----------------------------------------------------+----------------------------------------------------
Epsilon amino group of lysine                       | Benzyloxycarbonyl (Z) group
Guanidino functionality of arginine                 | Not usually necessary but nitration can be used
Imidazole ring of histidine                         | π-benzyloxymethyl (Bom) group
Hydroxyl functionalities of serine, threonine and   | Benzyl group
tyrosine                                            |
Hydroxyl functionality of tyrosine                  | 2-Bromobenzyloxycarbonyl group
Indole ring of tryptophan                           | Not usually necessary
Thiol functionality of cysteine                     | Benzyl group
Amide functionalities of glutamine and asparagine   | Not usually necessary
Carboxylic acid functionalities of glutamic acid    | Benzyl ester group
and aspartic acid                                   |
Thioether of methionine                             | Sometimes protected as sulphoxide



Again, the practitioner of ordinary skill in the art will be aware of other protective groups 
for use with reactive side chains that are orthogonal to t-BOC alpha amino protection. 
Various different solid supports and resins are commercially available for peptide 
synthesis using either the FMOC or t-BOC procedures (for a review of solid supports see 
Meldal M., Methods Enzymol 289: 83-104, "Properties of solid supports." 1997). 



Mass Modified Amino Acids 

A variety of amino acids can be used in the mass marker moiety and the mass 
normalisation moiety. Neutral amino acids are preferred in the mass normalisation 
moiety and charged amino acids are preferred in the mass marker moieties (since this 
facilitates ionisation and increases sensitivity) e.g. in the position marked amino acid 1 
and amino acid 2 in the first and fourth embodiments of this invention. A number of 
commercially available isotopically mass modified amino acids are shown in Table 5 
below. Any combination of 1, 2, 3, or 4 or more amino acids from this list is preferred 
in each of the moieties according to the present invention. 

Table 5 

Amino acid    | Isotope Forms
--------------+------------------------------------------------------------------------
Alanine       | [illegible], CH3CD(NH2)CO2H, CH3^13CH(^15NH2)CO2H, CD3CH(NH2)CO2H,
              | CD3CD(NH2)CO2H, [illegible], [illegible], [illegible]
Arginine      | [illegible]
Asparagine    | [illegible], [illegible], [illegible], [illegible]
Aspartic Acid | HO2^13CCH2CH(NH2)CO2H, HO2C^13CH2CH(NH2)CO2H,
              | HO2CCH2CH(NH2)^13CO2H, HO2^13CCH2CH(NH2)^13CO2H,
              | HO2CCH2^13CH(NH2)^13CO2H, [illegible],
              | HO2^13C^13CH2^13CH(NH2)^13CO2H, HO2CCD2CD(NH2)CO2H,
              | [illegible], HO2CCH2CH(^15NH2)^13CO2H
Cysteine      | Not available
Glutamic Acid | [illegible], [illegible], [illegible], HO2C^13CH2CH2CH(NH2)CO2H,
              | HO2^13CCH2CH2CH(NH2)CO2H, [illegible], HO2CCD2CH2CH(NH2)CO2H,
              | HO2CCD2CD2CD(NH2)CO2H,
              | HO2^13C^13CH2^13CH2^13CH(^15NH2)^13CO2H
Glutamine     | [illegible], [illegible], H2NCOCD2CD2CD(NH2)CO2H, [illegible],
              | [illegible], H2^15NCOCH2CH2CH(^15NH2)CO2H,
              | H2^15N^13CO^13CH2^13CH2^13CH(^15NH2)^13CO2H
Glycine       | H2NCH2^13CO2H, H2N^13CH2CO2H, H2N^13CH2^13CO2H, H2NCD2CO2H,
              | H2^15NCH2CO2H, H2^15N^13CH2CO2H, H2^15NCH2^13CO2H,
              | H2^15N^13CH2^13CO2H
Histidine     | [illegible], (CH)2N2CCH2CH(^15NH2)CO2H, (CH)2^15N2CCH2CH(NH2)CO2H
Isoleucine    | Not available
Leucine       | [illegible], (CH3)2CHCH2^13CH(NH2)CO2H,
              | (CH3)2CHCH2^13CH(NH2)^13CO2H, (CH3)2CHCH2CD(NH2)CO2H,
              | (CH3)2CHCD2CD(NH2)CO2H, (CD3)(CH3)CHCH2CH(NH2)CO2H,
              | (CD3)2CDCH2CH(NH2)CO2H, (CD3)2CDCD2CD(NH2)CO2H, [illegible]
Lysine        | [illegible], H2NCH2CH2CH2CH2^13CH(NH2)CO2H,
              | H2N^13CH2CH2CH2CH2CH(NH2)CO2H, [illegible], [illegible],
              | H2NCD2CD2CD2CD2CH(NH2)CO2H, H2NCH2CH2CH2CH2CH(^15NH2)CO2H,
              | [illegible], H2^15N^13CH2CH2CH2CH2CH(NH2)CO2H
Methionine    | CH3SCH2CH2CH(NH2)^13CO2H, CH3SCH2CH2^13CH(NH2)CO2H,
              | ^13CH3SCH2CH2CH(NH2)CO2H, CH3SCH2CH2CD(NH2)CO2H,
              | CD3SCH2CH2CH(NH2)CO2H, [illegible]
Phenylalanine | [illegible], [illegible], ^13C6H5CH2CH(NH2)CO2H,
              | C6H5CH2CD(NH2)CO2H, C6H5CD2CH(NH2)CO2H, C6D5CH2CH(NH2)CO2H,
              | C6D5CD2CD(NH2)CO2H, [illegible]
Proline       | [structure drawings not legible]
Serine        | [illegible], HOCH2^13CH(NH2)CO2H, HO^13CH2CH(NH2)CO2H,
              | HOCH2CH(^15NH2)CO2H, HOCH2^13CH(^15NH2)CO2H
Threonine     | CH3CH(OH)CH(NH2)CO2H
Tryptophan    | [structure drawings not legible]
Tyrosine      | HO(C6H4)CH2CH(NH2)^13CO2H, HO(C6H4)CH2^13CH(NH2)CO2H,
              | HO(C6H4)^13CH2^13CH(NH2)^13CO2H, [illegible], [illegible],
              | HO(C6D2H2)CH2CH(NH2)CO2H, HO(C6D4)CH2CH(NH2)CO2H, [illegible],
              | H^17O(C6H4)CH2CH(NH2)CO2H, [illegible], [illegible],
              | HO(^13C6H4)^13CH2^13CH(^15NH2)^13CO2H
Valine        | [illegible], [illegible], [illegible], (CD3)2CDCD(NH2)CO2H,
              | (CH3)2CHCH(^15NH2)CO2H



For many of the above amino acids, both the D- and L- forms are available (from 
ISOTEC Inc., Miamisburg, Ohio for example), either of which may be used in the 
preparation of the tags of this invention. Mixtures of D and L forms are also available but 
are less preferred if the tags of this invention are to be used in chromatographic 
separations. For some, FMOC or t-BOC protected derivatives are also available. Mass 
modified amino acids based on substitution of deuterium for hydrogen and on substitution 
of 13C and 15N isotopes for 12C and 14N isotopes are also available and are equally 
applicable for the synthesis of the tags of this invention. Various amino acids that are not 
typically found in peptides may also be used in the tags of this invention, for example 
deuterated forms of amino-butyric acid are commercially available. For the purposes of 
this invention non-radioactive, stable isotopes are preferred for safety reasons but there is 
no necessary limitation to stable isotopes. 
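
The mass difference introduced by a given pattern of isotope substitution can be estimated from standard monoisotopic masses. The sketch below is illustrative only: the function name `mass_shift` is hypothetical, and the isotope masses are standard published values rounded to eight decimal places rather than figures taken from this description.

```python
from typing import Dict

# Mass gained per substitution, in unified atomic mass units (u).
ISOTOPE_SHIFT: Dict[str, float] = {
    "13C": 13.00335484 - 12.00000000,  # 13C replacing 12C
    "15N": 15.00010890 - 14.00307401,  # 15N replacing 14N
    "2H": 2.01410178 - 1.00782503,     # deuterium replacing hydrogen
}


def mass_shift(substitutions: Dict[str, int]) -> float:
    """Total mass increase for the given number of each isotope substitution."""
    return sum(ISOTOPE_SHIFT[isotope] * count
               for isotope, count in substitutions.items())


# Glycine carrying two 13C atoms and one 15N atom.
fully_labelled_glycine = mass_shift({"13C": 2, "15N": 1})
# Glycine carrying two deuterium atoms on the alpha carbon.
dideuterated_glycine = mass_shift({"2H": 2})
print(round(fully_labelled_glycine, 5), round(dideuterated_glycine, 5))
```

Each substitution adds close to one nominal mass unit, but the exact increments differ slightly between 13C, 15N and deuterium.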

Fluorinated derivatives of a number of amino acids are also available. Some of the 
commercially available fluorinated amino acids are shown in Table 6 below. 



Table 6 

Amino acid    | Fluorinated Forms
--------------+------------------------------------------------------------------
Glutamic Acid | HO2CCFHCH2CH(NH2)CO2H
Leucine       | (CH3)(CF3)CHCH2CH(NH2)CO2H
Phenylalanine | [illegible], [illegible], C6F3H2CH2CH(NH2)CO2H
Phenylglycine | C6FH4CH(NH2)CO2H, C6F2H3CH(NH2)CO2H, [illegible]
Valine        | (CH3)2CFCH(NH2)CO2H



For most of the above fluorinated amino acids, the reagents are available as mixtures of D 
and L forms. In general, fluorinated variants of amino acids are less preferred than 
isotope substituted variants. The fluorinated compounds can be used to generate a range 
of mass tags with the same mass but each tag will be chemically different, which means 
that their behaviour in the mass spectrometer will vary more than isotope substituted tags. 
Moreover, the tags will not have identical chromatographic properties if the tags are to be 
used in chromatographic separations. 

Reactive Functionalities 

In some aspects of this invention, as already explained, the mass tags of the invention 
comprise a reactive functionality. In the simplest embodiments this may be an 
N-hydroxysuccinimide ester introduced by activation of the C-terminus of the tag peptides 
of this invention. In conventional peptide synthesis, this activation step would have to 
take place after the peptide mass tag has been cleaved from the solid support used for its 
synthesis. An N-hydroxysuccinimide activated peptide mass tag could also be reacted 
with hydrazine to give a hydrazide reactive functionality, which can be used to label 
periodate oxidised sugar moieties, for example. Amino-groups or thiols can be used as 
reactive functionalities in some applications and these may be introduced by adding lysine 
or cysteine after amino acid 2 of the tag peptide. Lysine can be used to couple tags to free 
carboxyl functionalities using a carbodiimide as a coupling reagent. Lysine can also be 
used as the starting point for the introduction of other reactive functionalities into the tag 
peptides of this invention. The thiol-reactive maleimide functionality can be introduced 
by reaction of the lysine epsilon amino group with maleic anhydride. The cysteine thiol 
group can be used as the starting point for the synthesis of a variety of alkenyl sulphone 
compounds, which are useful protein labelling reagents that react with thiols and amines. 
Compounds such as aminohexanoic acid can be used to provide a spacer between the 
mass modified amino acids and the reactive functionality. 

Affinity Capture Ligands 

In certain embodiments of the first aspect of this invention the mass markers comprise an 
affinity capture ligand. Affinity capture ligands are ligands which have highly specific 
binding partners. These binding partners allow molecules tagged with the ligand to be 
selectively captured by the binding partner. Preferably a solid support is derivatised with 
the binding partner so that affinity ligand tagged molecules can be selectively captured 
onto the solid phase support. A preferred affinity capture ligand is biotin, which can be 
introduced into the peptide mass tags of this invention by standard methods known in the 
art. In particular a lysine residue may be incorporated after amino acid 2 through which 
an amine-reactive biotin can be linked to the peptide mass tags (see for example Geahlen 
R.L. et al., Anal Biochem 202(1): 68-67, "A general method for preparation of peptides 
biotinylated at the carboxy terminus." 1992; Sawutz D.G. et al., Peptides 12(5): 
1019-1012, "Synthesis and molecular characterization of a biotinylated analog of 
[Lys]bradykinin." 1991; Natarajan S. et al., Int J Pept Protein Res 40(6): 567-567, 
"Site-specific biotinylation. A novel approach and its application to endothelin-1 analogs and 
PTH-analog." 1992). Iminobiotin is also applicable. A variety of avidin counter-ligands 
for biotin are available, which include monomeric and tetrameric avidin and streptavidin, 
all of which are available on a number of solid supports. 

Other affinity capture ligands include digoxigenin, fluorescein, nitrophenyl moieties and a 
number of peptide epitopes, such as the c-myc epitope, for which selective monoclonal 
antibodies exist as counter-ligands. Metal ion binding ligands such as hexahistidine, 
which readily binds Ni2+ ions, are also applicable. Chromatographic resins which present 
iminodiacetic acid chelated Ni2+ ions are commercially available, for example. These 
immobilised nickel columns may be used to capture peptide mass tags which comprise 
oligomeric histidine. As a further alternative, an affinity capture functionality may be 
selectively reactive with an appropriately derivatised solid phase support. Boronic acid, 
for example, is known to selectively react with vicinal cis-diols and chemically similar 
ligands, such as salicylhydroxamic acid. Reagents comprising boronic acid have been 
developed for protein capture onto solid supports derivatised with salicylhydroxamic acid 
(Stolowitz M.L. et al., Bioconjug Chem 12(2): 229-239, "Phenylboronic Acid-
Salicylhydroxamic Acid Bioconjugates. 1. A Novel Boronic Acid Complex for Protein 
Immobilization." 2001; Wiley J.P. et al., Bioconjug Chem 12(2): 240-250, 
"Phenylboronic Acid-Salicylhydroxamic Acid Bioconjugates. 2. Polyvalent 
Immobilization of Protein Ligands for Affinity Chromatography." 2001, Prolinx, Inc, 
Washington State, USA). It is anticipated that it should be relatively simple to link a 
phenylboronic acid functionality to a peptide mass tag according to this invention to 
generate capture reagents that can be captured by selective chemical reactions. The use of 
this sort of chemistry would not be directly compatible with biomolecules bearing vicinal 
cis-diol-containing sugars, however these sorts of sugars could be blocked with 
phenylboronic acid or related reagents prior to reaction with boronic acid derivitised 
peptide mass tag reagents. 

Mass Spec Sensitivity Enhancing Groups and Mass Differentiation 

In preferred embodiments of the first and fourth aspects of this invention the peptide mass 
tags comprise Sensitivity Enhancing Groups. Figures 1 to 5 illustrate the use of 
methylation and guanidination as methods of improving sensitivity. In addition, these 
Sensitivity Enhancing Groups can differentiate the fragmentation products of the N- 
terminal amino acid from the fragmentation products of the second amino acid in the 
peptide tag and natural amino acid residues in the protein, if this is the same as the first 
amino acid. The sensitivity enhancing group can also distinguish the fragmentation 
products of the N-terminal amino acid of the peptide mass tag from the fragmentation 
products of natural amino acids when the tags of this invention are used to label peptides 
and proteins. The guanidino group and the tertiary amino group are both useful 
Sensitivity Enhancing Groups for electrospray mass spectrometry. 

Various other methods for derivatising peptides have also been developed. These 
include the use of quaternary ammonium derivatives, quaternary phosphonium derivatives 
and pyridyl derivatives for positive ion mass spectrometry. Halogenated compounds, 
particularly halogenated aromatic compounds, are well known electrophores, i.e. they pick 
up thermal electrons very easily. A variety of derivatisation reagents based on fluorinated 
aromatic compounds (Bian N. et al., Rapid Commun Mass Spectrom 11(16): 1781-1784, 
"Detection via laser desorption and mass spectrometry of multiplex electrophore-labelled 
albumin." 1997) have been developed for electron capture detection, which is a highly 
sensitive ionisation and detection process that can be used with negative ion mass 
spectrometry (Abdel-Baky S. & Giese R.W., Anal Chem 63(24):2986-2989, "Gas 
chromatography/electron capture negative-ion mass spectrometry at the zeptomole level." 
1991). A fluorinated aromatic group could also be used as a sensitivity enhancing group. 
Aromatic sulphonic acids have also been used for improving sensitivity in negative ion 
mass spectrometry. 

Each type of Sensitivity Enhancing Group has different benefits, which depend on the 
method of ionisation used and on the methods of mass analysis used. The mechanism by 
which sensitivity is enhanced may also be different for each type of group. Some 
derivatisation methods increase basicity and thus promote protonation and charge 
localisation, while other methods increase surface activity of the tagged peptides, which 
improves sensitivity in surface desorption techniques like Matrix Assisted Laser 
Desorption Ionisation (MALDI) and Fast Atom Bombardment (FAB). Negative ion mass 
spectrometry is often more sensitive because there is less background noise. Charge 
derivatisation can also change the fragmentation products of derivatised peptides when 
collision induced dissociation is used. In particular some derivatisation techniques 
simplify fragmentation patterns, which is highly advantageous. The choice of Sensitivity 
Enhancing Group is determined by the mass spectrometry techniques that will be 
employed (for a review see Roth et al., Mass Spectrometry Reviews 17:255-274, "Charge 
derivatisation of peptides for analysis by mass spectrometry", 1998). For the purposes of 
this invention all of the known derivatisation techniques could be used with the peptide 
mass tags of this invention. The published protocols could be used without modification 
to derivatise the peptide mass tags of this invention after solid phase peptide synthesis or 
the protocols could be readily adapted for use during solid phase synthesis if desired. 

Analysis of peptides by mass spectrometry 

The essential features of a mass spectrometer are as follows: 

Inlet System -> Ion Source -> Mass Analyser -> Ion Detector -> Data Capture System 

There are preferred inlet systems, ion sources and mass analysers for the purposes of 
analysing peptides. 

Inlet Systems 

In the second aspect of this invention a chromatographic or electrophoretic separation is 
preferred to reduce the complexity of the sample prior to analysis by mass spectrometry. 
A variety of mass spectrometry techniques are compatible with separation technologies, 
particularly capillary zone electrophoresis and High Performance Liquid Chromatography 
(HPLC). The choice of ionisation source is limited to some extent if a separation is 
required, as ionisation techniques such as MALDI and FAB (discussed below) which 
ablate material from a solid surface are less suited to chromatographic separations. For 
most purposes, it has been very costly to link a chromatographic separation in-line with 
mass spectrometric analysis by one of these techniques. Dynamic FAB and ionisation 
techniques based on spraying such as electrospray, thermospray and APCI are all readily 
compatible with in-line chromatographic separations and equipment to perform such 
liquid chromatography mass spectrometry analysis is commercially available. 

Ionisation techniques 

For many biological mass spectrometry applications so called 'soft' ionisation techniques 
are used. These allow large molecules such as proteins and nucleic acids to be ionised 
essentially intact. The liquid phase techniques allow large biomolecules to enter the mass 
spectrometer in solutions with mild pH and at low concentrations. A number of 
techniques are appropriate for use with this invention including but not limited to 
Electrospray Ionisation Mass Spectrometry (ESI-MS), Fast Atom Bombardment (FAB), 
Matrix Assisted Laser Desorption Ionisation Mass Spectrometry (MALDI MS) and 
Atmospheric Pressure Chemical Ionisation Mass Spectrometry (APCI-MS). 

Electrospray Ionisation 

Electrospray ionisation requires that the dilute solution of the analyte molecule is 
'atomised' into the spectrometer, i.e. injected as a fine spray. The solution is, for example, 
sprayed from the tip of a charged needle in a stream of dry nitrogen and an electrostatic 
field. The mechanism of ionisation is not fully understood but is thought to work broadly 
as follows. In a stream of nitrogen the solvent is evaporated. With a small droplet, this 
results in concentration of the analyte molecule. Given that most biomolecules have a net 
charge this increases the electrostatic repulsion of the dissolved molecule. As evaporation 
continues this repulsion ultimately becomes greater than the surface tension of the droplet 
and the droplet disintegrates into smaller droplets. This process is sometimes referred to 
as a 'Coulombic explosion'. The electrostatic field helps to further overcome the surface 
tension of the droplets and assists in the spraying process. The evaporation continues 
from the smaller droplets which, in turn, explode iteratively until essentially the 
biomolecules are in the vapour phase, as is all the solvent. This technique is of particular 
importance in the use of mass labels in that the technique imparts a relatively small 
amount of energy to ions in the ionisation process and the energy distribution within a 
population tends to fall in a narrower range when compared with other techniques. The 
ions are accelerated out of the ionisation chamber by the use of electric fields that are set 
up by appropriately positioned electrodes. The polarity of the fields may be altered to 
extract either negative or positive ions. The potential difference between these electrodes 
determines whether positive or negative ions pass into the mass analyser and also the 
kinetic energy with which these ions enter the mass spectrometer. This is of significance 
when considering fragmentation of ions in the mass spectrometer. The more energy 
imparted to a population of ions the more likely it is that fragmentation will occur through 
collision of analyte molecules with the bath gas present in the source. By adjusting the 
electric field used to accelerate ions from the ionisation chamber it is possible to control 
the fragmentation of ions. This is advantageous when fragmentation of ions is to be used 
as a means of removing tags from a labelled biomolecule. Electrospray ionisation is 
particularly advantageous as it can be used in-line with liquid chromatography, referred to 
as Liquid Chromatography Mass Spectrometry (LC-MS). 
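
The point at which electrostatic repulsion overcomes surface tension is commonly estimated with the Rayleigh limit, q = 8*pi*sqrt(epsilon_0 * gamma * r^3). The following is a minimal sketch of that estimate only; the Rayleigh model, the function names and the default surface tension (roughly that of water) are assumptions introduced here for illustration and are not taken from this specification.

```python
import math

EPSILON_0 = 8.8541878128e-12          # vacuum permittivity, F/m
ELEMENTARY_CHARGE = 1.602176634e-19   # C


def rayleigh_limit_charge(radius_m, surface_tension_n_per_m):
    """Charge (C) at which repulsion balances surface tension (Rayleigh model)."""
    return 8.0 * math.pi * math.sqrt(
        EPSILON_0 * surface_tension_n_per_m * radius_m ** 3
    )


def rayleigh_limit_elementary_charges(radius_m, surface_tension_n_per_m=0.072):
    """Same limit expressed as a number of elementary charges.

    The default surface tension (0.072 N/m, approximately water) is an
    illustrative assumption; real spray solvents differ.
    """
    return rayleigh_limit_charge(radius_m, surface_tension_n_per_m) / ELEMENTARY_CHARGE


if __name__ == "__main__":
    for radius in (1e-6, 0.5e-6, 0.1e-6):
        n = rayleigh_limit_elementary_charges(radius)
        print(f"radius {radius:.1e} m: about {n:.3g} elementary charges")
```

Because the limit scales with r^1.5 while the charge carried by an evaporating droplet stays roughly constant, a shrinking droplet eventually exceeds the limit, which is consistent with the iterative disintegration described above.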

Matrix Assisted Laser Desorption Ionisation (MALDI) 

MALDI requires that the biomolecule solution be embedded in a large molar excess of a 
photo-excitable 'matrix'. The application of laser light of the appropriate frequency 
results in the excitation of the matrix which in turn leads to rapid evaporation of the 
matrix along with its entrapped biomolecule. Proton transfer from the acidic matrix to the 
biomolecule gives rise to protonated forms of the biomolecule which can be detected by 
positive ion mass spectrometry, particularly by Time-Of-Flight (TOF) mass spectrometry. 
Negative ion mass spectrometry is also possible by MALDI TOF. This technique imparts 
a significant quantity of translational energy to ions, but tends not to induce excessive 
fragmentation despite this. Accelerating voltages can nevertheless be used to control 
fragmentation with this technique. 

Fast Atom Bombardment 

Fast Atom Bombardment (FAB) has come to describe a number of techniques for 
vaporising and ionising relatively involatile molecules. In these techniques a sample is 
desorbed from a surface by collision of the sample with a high energy beam of xenon 
atoms or caesium ions. The sample is coated onto a surface with a simple matrix, 
typically a non volatile material, e.g. m-nitrobenzyl alcohol (NBA) or glycerol. FAB 
techniques are also compatible with liquid phase inlet systems - the liquid eluting from a 
capillary electrophoresis inlet or a high pressure liquid chromatography system passes 
through a frit, essentially coating the surface of the frit with analyte solution which can be 
ionised from the frit surface by atom bombardment. 

Mass Analysers 

Fragmentation of peptides by collision induced dissociation is used in this invention to 
identify tags on proteins. Various mass analyser geometries may be used to fragment 
peptides and to determine the mass of the fragments. 

MS/MS and MSn analysis of peptides 

Tandem mass spectrometers allow ions with a pre-determined mass-to-charge ratio to be 
selected and fragmented by collision induced dissociation (CID). The fragments can then 
be detected providing structural information about the selected ion. When peptides are 
analysed by CID in a tandem mass spectrometer, characteristic cleavage patterns are 
observed, which allow the sequence of the peptide to be determined. Natural peptides 
typically fragment randomly at the amide bonds of the peptide backbone to give series of 
ions that are characteristic of the peptide. CID fragment series are denoted a_n, b_n, c_n, etc. 
for cleavage at the n-th peptide bond where the charge of the ion is retained on the 
N-terminal fragment of the ion. Similarly, fragment series are denoted x_n, y_n, etc. where 
the charge is retained on the C-terminal fragment of the ion. 

[Diagram: peptide backbone cleavage positions giving the a, b and c (N-terminal) and the x, y and z (C-terminal) fragment ion series] 

Trypsin and thrombin are favoured cleavage agents for tandem mass spectrometry as they 
produce peptides with basic groups at both ends of the molecule, i.e. the alpha-amino 
group at the N-terminus and lysine or arginine side-chains at the C-terminus. This favours 
the formation of doubly charged ions, in which the charged centres are at opposite termini 
of the molecule. These doubly charged ions produce both C-terminal and N-terminal ion 
series after CID. This assists in determining the sequence of the peptide. Generally 
speaking only one or two of the possible ion series are observed in the CID spectra of a 
given peptide. In low-energy collisions typical of quadrupole based instruments the 
b-series of N-terminal fragments or the y-series of C-terminal fragments predominate. If 
doubly charged ions are analysed then both series are often detected. In general, the 
y-series ions predominate over the b-series. 
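
The b- and y-series described above can be illustrated by calculating their expected mass-to-charge ratios for a given sequence. The following is a minimal sketch, assuming standard monoisotopic residue masses, unmodified termini and protonation as the only charge carrier; the function and constant names are hypothetical and are introduced only for this illustration.

```python
# Standard monoisotopic residue masses (Da) for the 20 common amino acids.
MONOISOTOPIC_RESIDUE_MASS = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "I": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
    "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
PROTON = 1.007276   # Da
WATER = 18.010565   # Da


def fragment_ions(sequence, charge=1):
    """Return (b, y) dicts mapping fragment index n to m/z.

    b_n holds the first n residues (charge retained on the N-terminal side);
    y_n holds the last n residues plus water (charge on the C-terminal side).
    """
    residues = [MONOISOTOPIC_RESIDUE_MASS[aa] for aa in sequence]
    total = sum(residues)
    length = len(residues)
    b_ions, y_ions = {}, {}
    running = 0.0
    for n in range(1, length):
        running += residues[n - 1]
        b_ions[n] = (running + charge * PROTON) / charge
        y_ions[length - n] = (total - running + WATER + charge * PROTON) / charge
    return b_ions, y_ions


if __name__ == "__main__":
    b, y = fragment_ions("APA")
    for n in sorted(b):
        print(f"b{n} = {b[n]:.4f}   y{n} = {y[n]:.4f}")
```

For singly charged fragments, each complementary pair b_n and y_(N-n) of an N-residue peptide sums to the neutral peptide mass plus two protons, which offers a simple consistency check when assigning peaks to the two series.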

In general peptides fragment via a mechanism that involves protonation of the amide 
backbone followed by intramolecular nucleophilic attack, leading to the formation of a 
5-membered oxazolone structure and cleavage of the amide linkage that was protonated 
(Schlosser A. and Lehmann W.D., J. Mass Spectrom. 35: 1382-1390, "Five-membered 
ring formation in unimolecular reactions of peptides: a key structural element controlling 
low-energy collision induced dissociation", 2000). Figure 16a shows one proposed 
mechanism by which this sort of fragmentation takes place. This mechanism requires a 
carbonyl group from an amide bond adjacent to a protonated amide on the N-terminal side 
of the protonated amide to carry out the nucleophilic attack. A charged oxazolonium ion 
gives rise to b-series ions, while proton transfer from the N-terminal fragment to the 
C-terminal fragment gives rise to y-series ions as shown in figure 16a. This requirement for 
an appropriately located carbonyl group does not account for cleavage at amide bonds 
adjacent to the N-terminal amino acid when the N-terminus is not protected and, in 
general, b-series ions are not seen for the amide between the N-terminal and second 
amino acid in a peptide. However, peptides with acetylated N-termini do meet the 
structural requirements of this mechanism and fragmentation can take place at the amide 
bond immediately after the first amino acid by this mechanism. Peptides with 
thioacetylated N-termini, as shown in figure 16c, will cleave particularly easily by the 
oxazolone mechanism as the sulphur atom is more nucleophilic than an oxygen atom in 
the same position. Fragmentation of the amide backbone of a peptide can also be 
modulated by methylation of the backbone. Methylation of an amide nitrogen in a 
peptide can promote fragmentation of the next amide bond C-terminal to the methylated 
amide and also favours the formation of b-ions. The enhanced fragmentation may be 
partly due to the electron donating effect of the methyl group increasing the 
nucleophilicity of the carbonyl group of the methylated amide, while the enhanced 
formation of b-ions may be a result of the inability of the oxazolonium ion that forms to 
transfer protons to the C-terminal fragment as shown in figure 16b. In the context of this 
invention thioacetylation of the N-terminus of a tag dipeptide can be used to enhance 
cleavage of the tag peptide at the next amide bond. Similarly, methylation of the nitrogen 
atom of an N-terminal acetyl or thioacetyl group will also enhance cleavage of the 
adjacent amide bond. Figures 17a and 17b illustrate pairs of tags that exploit these 
methods of enhancing cleavage at the marked amide linkage. 

The ease of fragmentation of the amide backbone of a polypeptide or peptide is also 
significantly modulated by the side chain functionalities of the peptide. Thus the 
sequence of a peptide determines where it will fragment most easily. In general it is 
difficult to predict which amide bonds will fragment easily in a peptide sequence. This 
has important consequences for the design of the peptide mass tags of this invention. 
However, certain observations have been made that allow peptide mass tags that fragment 
at the desired amide bond to be designed. Proline, for example, is known to promote 
fragmentation at its N-terminal amide bond (Schwartz B.L., Bursey M.M., Biol. Mass 
Spectrom. 21:92, 1997) as fragmentation at the C-terminal amide gives rise to an 
energetically unfavourable strained bicyclic oxazolone structure. Aspartic acid also 
promotes fragmentation at its N-terminal amide bond. Asp-Pro linkages, however, are 
particularly labile in low energy CID analysis (Wysocki V.H. et al., J Mass Spectrom 
35(12): 1399-1406, "Mobile and localized protons: a framework for understanding 
peptide dissociation." 2000) and in this situation aspartic acid seems to promote the 
cleavage of the amide bond on its C-terminal side. Thus proline and Asp-Pro linkages can 
also be used in the tag peptides of this invention to promote fragmentation at specified 
locations within a peptide. Figures 17c and 17d illustrate pairs of tags that exploit these 
methods of enhancing cleavage at the marked amide linkage. Figure 17c illustrates a pair 
of tripeptide tags with the sequence alanine-proline-alanine. The proline linkage 
promotes cleavage at its N-terminal amide. This is enhanced by the presence of a 
thioacetyl protecting group at the N-terminus of the tripeptide and the cleavability is 
further enhanced by methylation of the N-terminal nitrogen. The tags have the same mass 
but in the first tag there is an alanine residue with heavy isotopes in the third position of 
the tripeptide while in the second tag there is an alanine residue with heavy isotopes in the 
first position of the tripeptide. Figure 17d illustrates a pair of tripeptide tags with the 
sequence aspartic acid-proline-alanine. The proline linkage promotes cleavage at its 
N-terminal amide. This is enhanced by the presence of the aspartic acid residue. The N- 
terminus of the tripeptide is methylated to promote localised protonation here. The tags 
have the same mass but in the first tag there is an alanine residue with heavy isotopes in 
the third position of the tripeptide while in the second tag there is an aspartic acid residue 
with heavy isotopes in the first position of the tripeptide. 
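
The principle behind such tag pairs, identical total mass but a different fragment mass after cleavage, can be checked numerically. The following is a minimal sketch for the alanine-proline-alanine pair, assuming that the heavy alanine carries three 13C atoms and one 15N atom (an illustrative choice, not stated in the text) and omitting the N-terminal thioacetyl and methyl groups and the reactive functionality, which would add the same constant to both tags.

```python
ALA = 71.03711      # monoisotopic residue mass of alanine, Da
PRO = 97.05276      # monoisotopic residue mass of proline, Da
PROTON = 1.007276
WATER = 18.010565

# Assumed labelling for illustration only: 13C3, 15N alanine.
HEAVY_SHIFT = 3 * 1.003355 + 0.997035


def tag_masses(residue_masses):
    """Return (neutral tag mass, m/z of the singly charged b1 fragment)."""
    total = sum(residue_masses) + WATER
    b1 = residue_masses[0] + PROTON
    return total, b1


# First tag: heavy alanine in the third position.
tag_one = [ALA, PRO, ALA + HEAVY_SHIFT]
# Second tag: heavy alanine in the first position.
tag_two = [ALA + HEAVY_SHIFT, PRO, ALA]

if __name__ == "__main__":
    for name, tag in (("tag one", tag_one), ("tag two", tag_two)):
        total, b1 = tag_masses(tag)
        print(f"{name}: total {total:.4f} Da, b1 fragment {b1:.4f}")
```

In this sketch both tags have the same total mass, so peptides labelled with either tag are selected together in the first mass analyser, while cleavage at the amide bond N-terminal to proline releases first-residue fragments that differ by the assumed isotope shift of about 4 Da.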

A typical tandem mass spectrometer geometry is a triple quadrupole which comprises two 
quadrupole mass analysers separated by a collision chamber, also a quadrupole. This 
collision quadrupole acts as an ion guide between the two mass analyser quadrupoles. A 
gas can be introduced into the collision quadrupole to allow collision with the ion stream 
from the first mass analyser. The first mass analyser selects ions on the basis of their 
mass/charge ratio, which pass through the collision cell where they fragment. The 
fragment ions are separated and detected in the third quadrupole. Induced cleavage can be 
performed in geometries other than tandem analysers. Ion trap mass spectrometers can 
promote fragmentation through introduction of a gas into the trap itself with which 
trapped ions will collide. Ion traps generally contain a bath gas, such as helium, but 
addition of neon, for example, promotes fragmentation. Similarly photon induced 
fragmentation could be applied to trapped ions. Another favourable geometry is a 
Quadrupole/Orthogonal Time of Flight tandem instrument where the high scanning rate 
of a quadrupole is coupled to the greater sensitivity of a reflectron TOF mass analyser to 
identify the products of fragmentation. 

Conventional 'sector' instruments are another common geometry used in tandem mass 
spectrometry. A sector mass analyser comprises two separate 'sectors': an electric sector, 
which focuses an ion beam leaving a source into a stream of ions with the same kinetic 
energy using electric fields, and a magnetic sector, which separates the ions on the basis of 
their mass to generate a spectrum at a detector. For tandem mass spectrometry a two sector 
mass analyser of this kind can be used where the electric sector provides the first mass 
analyser stage, the magnetic sector provides the second mass analyser, with a collision 
cell placed between the two sectors. Two complete sector mass analysers separated by a 
collision cell can also be used for analysis of mass tagged peptides. 

Ion Traps 

Ion Trap mass analysers are related to the quadrupole mass analysers. The ion trap 
generally has a 3 electrode construction - a cylindrical electrode with 'cap' electrodes at 
each end forming a cavity. A sinusoidal radio frequency potential is applied to the 
cylindrical electrode while the cap electrodes are biased with DC or AC potentials. Ions 
injected into the cavity are constrained to a stable circular trajectory by the oscillating 
electric field of the cylindrical electrode. However, for a given amplitude of the 
oscillating potential, certain ions will have an unstable trajectory and will be ejected from 
the trap. A sample of ions injected into the trap can be sequentially ejected from the trap 
according to their mass/charge ratio by altering the oscillating radio frequency potential. 
The ejected ions can then be detected allowing a mass spectrum to be produced. 

Ion traps are generally operated with a small quantity of a 'bath gas', such as helium, 
present in the ion trap cavity. This increases both the resolution and the sensitivity of the 
device as the ions entering the trap are essentially cooled to the ambient temperature of 
the bath gas through collision with the bath gas. Collisions both increase ionisation when 
a sample is introduced into the trap and dampen the amplitude and velocity of ion 
trajectories keeping them nearer the centre of the trap. This means that when the 
oscillating potential is changed, ions whose trajectories become unstable gain energy 
more rapidly, relative to the damped circulating ions, and exit the trap in a tighter bunch 
giving narrower, larger peaks. 

Ion traps can mimic tandem mass spectrometer geometries; in fact they can mimic 
multiple mass spectrometer geometries allowing complex analyses of trapped ions. A 
single mass species from a sample can be retained in a trap, i.e. all other species can be 
ejected and then the retained species can be carefully excited by super-imposing a second 
oscillating frequency on the first. The excited ions will then collide with the bath gas and 
will fragment if sufficiently excited. The fragments can then be analysed further. It is 
possible to retain a fragment ion for further analysis by ejecting other ions and then 
exciting the fragment ion to fragment. This process can be repeated for as long as 
sufficient sample exists to permit further analysis. It should be noted that these 
instruments generally retain a high proportion of fragment ions after induced 
fragmentation. These instruments and FTICR mass spectrometers (discussed below) 
represent a form of temporally resolved tandem mass spectrometry rather than spatially 
resolved tandem mass spectrometry which is found in linear mass spectrometers. 

Fourier Transform Ion Cyclotron Resonance Mass Spectrometry (FTICR MS) 
FTICR mass spectrometry has similar features to ion traps in that a sample of ions is 
retained within a cavity but in FTICR MS the ions are trapped in a high vacuum chamber 
by crossed electric and magnetic fields. The electric field is generated by a pair of plate 
electrodes that form two sides of a box. The box is contained in the field of a 
superconducting magnet which, in conjunction with the two plates (the trapping plates), 
constrains injected ions to a circular trajectory between the trapping plates, perpendicular 
to the applied magnetic field. The ions are excited to larger orbits by applying a 
radio-frequency pulse to two 'transmitter plates' which form two further opposing sides of 
the box. The cycloidal motion of the ions generates corresponding electric fields in the 
remaining two opposing sides of the box which comprise the 'receiver plates'. The 
excitation pulses excite ions to larger orbits which decay as the coherent motion of the 
ions is lost through collisions. The corresponding signals detected by the receiver plates 
are converted to a mass spectrum by Fourier Transform (FT) analysis. 
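
The Fourier transform yields the frequencies present in the detected signal, and each frequency maps to a mass-to-charge ratio through the cyclotron relation f = eB / (2*pi*(m/z)). The following is a minimal sketch of that conversion; it uses the unperturbed cyclotron frequency only and ignores the trapping-field and space-charge corrections applied in practice, and the function names and the 7 tesla example field are assumptions made for illustration.

```python
import math

ELEMENTARY_CHARGE = 1.602176634e-19   # C
DALTON = 1.66053906660e-27            # kg


def cyclotron_frequency_hz(mz, field_tesla):
    """Unperturbed cyclotron frequency (Hz) for an ion of m/z (Da per charge)."""
    return ELEMENTARY_CHARGE * field_tesla / (2.0 * math.pi * mz * DALTON)


def mz_from_frequency(frequency_hz, field_tesla):
    """Inverse relation: recover m/z from a detected frequency."""
    return ELEMENTARY_CHARGE * field_tesla / (2.0 * math.pi * frequency_hz * DALTON)


if __name__ == "__main__":
    field = 7.0  # tesla, illustrative value
    for mz in (500.0, 1000.0, 2000.0):
        f = cyclotron_frequency_hz(mz, field)
        print(f"m/z {mz:7.1f} -> {f / 1e3:8.2f} kHz")
```

Since frequency is inversely proportional to m/z, ions of lower mass-to-charge ratio circulate at higher frequencies for a given magnetic field.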

For induced fragmentation experiments these instruments can perform in a similar manner 
to an ion trap - all ions except a single species of interest can be ejected from the trap. A 
collision gas can be introduced into the trap and fragmentation can be induced. The 
fragment ions can be subsequently analysed. Generally fragmentation products and bath 
gas combine to give poor resolution if analysed by FT analysis of signals detected by the 
'receiver plates'; however, the fragment ions can be ejected from the cavity and analysed in 
a tandem configuration with a quadrupole, for example. 

Separation of labelled peptides by chromatography or electrophoresis 
In the optional second step of the second aspect of this invention, labelled biomolecules 
are subjected to a chromatographic separation prior to analysis by mass spectrometry. 
This is preferably High Performance Liquid Chromatography (HPLC), which can be 
coupled directly to a mass spectrometer for in-line analysis of the peptides as they elute 
from the chromatographic column. A variety of separation techniques may be performed 
by HPLC but reverse phase chromatography is a popular method for the separation of 
peptides prior to mass spectrometry. Capillary zone electrophoresis is another separation 
method that may be coupled directly to a mass spectrometer for automatic analysis of 
eluting samples. These and other fractionation techniques may be applied to reduce the 
complexity of a mixture of biomolecules prior to analysis by mass spectrometry. 

Applications of the invention 

Labelling peptides and polypeptides and analysis by LC-MS-MS 
In preferred embodiments of the second aspect of this invention, the tags are used for the 
analysis of mixtures of peptides by liquid chromatography tandem mass spectrometry 
(LC-MS-MS). The use of the mass labels of this invention according to the second 
aspect will now be discussed in the context of the analysis of peptides. Peptide mass tags 
such as those in figures 1 and 2 may be used to label peptides. If the reactive functionality 
on these compounds is an N-hydroxysuccinimide ester then the tags will be reactive with 
free amino groups such as alpha-amino groups and epsilon amino groups in lysine. 

After attachment of the tags, the labelled peptides will have a mass that is shifted by the 
mass of the tag. The mass of the peptide may be sufficient to identify the source protein. 
In this case only the tag needs to be detected, which can be achieved by selected reaction 
monitoring with a triple quadrupole, discussed in more detail below. Briefly, the first 
quadrupole of the triple quadrupole is set to let through ions whose mass-to-charge ratio 
corresponds to that of the peptide of interest, adjusted for the mass of the marker. The 
selected ions are then subjected to collision induced dissociation (CID) in the second 
quadrupole. Under the sort of conditions used in the analysis of peptides the ions will 
fragment mostly at the amide bonds in the molecule. The markers in figures 1 and 2 have 
an amide bond, which releases the N-terminal portion of the tag on cleavage. Although 
the tags all have the same mass, the terminal portion is different because of differences in 
the substituents on either side of the amide bond. Thus the markers can be distinguished 
from each other. The presence of the marker fragment associated with an ion of a specific 
mass should confirm that the ion was a peptide and the relative peak heights of the tags 
from different samples will give information about the relative quantities of the peptides 
in their samples. If the mass is not sufficient to identify a peptide, either because a number 
of terminal peptides in the sample have the same terminal mass or because the peptide is 
not known, then sequence information may be determined by analysis of the complete 
CID spectrum. The peptide fragmentation peaks can be used to identify the peptides 
while the mass tag peaks give information about the relative quantities of the peptides. 
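
The two calculations implied by this scheme, setting the first quadrupole to the tag-shifted mass-to-charge ratio and converting tag fragment peak heights into relative quantities, can be sketched as follows. This is an illustration under stated assumptions only: the function names, the sample names and the numerical values are hypothetical, and the mass shift is taken to be the net mass added to the peptide on attachment of the tag.

```python
PROTON = 1.007276  # Da


def q1_setting(peptide_mass, tag_mass_shift, charge=1):
    """m/z to select in the first quadrupole for a tagged, protonated peptide.

    tag_mass_shift is the net mass added by the attached tag (an assumed input).
    """
    return (peptide_mass + tag_mass_shift + charge * PROTON) / charge


def relative_quantities(tag_fragment_intensities):
    """Normalise tag fragment peak heights so that they sum to 1."""
    total = sum(tag_fragment_intensities.values())
    if total <= 0:
        raise ValueError("no tag fragment signal to normalise")
    return {
        sample: intensity / total
        for sample, intensity in tag_fragment_intensities.items()
    }


if __name__ == "__main__":
    # Hypothetical peptide and tag masses, doubly charged precursor.
    print(f"Q1 m/z: {q1_setting(1000.0, 200.0, charge=2):.4f}")
    # Hypothetical peak heights of the tag fragments from two samples.
    print(relative_quantities({"sample_A": 3000.0, "sample_B": 1000.0}))
```

Because the tags share the same total mass, one first-quadrupole setting passes the same peptide from every sample, and the ratio of the tag fragment intensities then reflects the ratio of that peptide between the samples.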

The analysis of proteins by tandem mass spectrometry, particularly mixtures of peptides, 
is complicated by the 'noisiness' of the spectra obtained. Peptides isolated from 
biological samples are often contaminated with buffering reagents, denaturants and 
detergents, all of which introduce peaks into the mass spectrum. As a result, there are 
often more contamination peaks in the spectrum than peptide peaks and identifying peaks 
that correspond to peptides is a major problem, especially with small samples of proteins 
that are difficult to isolate. As a result various methods are used to determine which 
peaks correspond to peptides before detailed CID analysis is performed. Triple 
quadrupole based instruments permit 'precursor ion scanning' (see Wilm M. et al., Anal 
Chem 68(3):527-33, "Parent ion scans of unseparated peptide mixtures." (1996)). The 
triple quadrupole is operated in 'single reaction monitoring' mode, in which the first 
quadrupole scans over the full mass range and each gated ion is subjected to CID in the 
second quadrupole. The third quadrupole is set to detect only one specific fragment ion, 
which is usually a characteristic fragment ion from a peptide such as an immonium ion. 
The presence of phosphate groups can also be detected using this sort of technique. An 
alternative method used with quadrupole/time-of-flight mass spectrometers scans for 
doubly charged ions by identifying ions which when subjected to CID produce daughter 
ions with higher mass-to-charge ratios than the parent ion. A further method of 
identifying doubly charged ions is to look for sets of peaks in the spectrum which are only 
0.5 daltons apart with appropriate intensity ratios, which would indicate that the ions are 
the same, differing only by the proportion of 13C present in the molecule. 
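
The spacing rule for doubly charged ions follows from the fact that neighbouring isotope peaks differ by one 13C atom, so their separation on the mass-to-charge axis is about 1.00335/z. The following is a minimal sketch of this inference; the function name and the example peak positions are hypothetical.

```python
C13_C12_MASS_DIFFERENCE = 1.003355  # Da


def charge_from_isotope_spacing(mz_first, mz_second):
    """Estimate the charge state from two adjacent isotope peaks."""
    spacing = abs(mz_second - mz_first)
    if spacing == 0:
        raise ValueError("peaks must have different m/z values")
    return round(C13_C12_MASS_DIFFERENCE / spacing)


if __name__ == "__main__":
    # Hypothetical adjacent isotope peaks 0.5 apart: doubly charged ion.
    print(charge_from_isotope_spacing(500.25, 500.75))
    # Hypothetical adjacent isotope peaks 1.0 apart: singly charged ion.
    print(charge_from_isotope_spacing(500.25, 501.25))
```

A practical implementation would also check that the intensity ratios of the peaks are consistent with an isotope envelope, as noted above, before accepting the assignment.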

By labelling peptides with the mass labels of this invention, a novel form of precursor ion 
scanning may be envisaged in which peptide peaks are identified by the presence of 
fragments corresponding to the mass labels of this invention after subjecting the labelled 
peptides to CID. In particular, the peptides isolated from each sample by the methods of 
this invention may be labelled with more than one tag. An equimolar mixture of a 
'precursor ion scanning' tag which is used in all samples and a sample specific tag may be 
used to label the peptides in each sample. In this way changes in the level of peptides in 
different samples will not have an adverse effect on the identification of peptide peaks in 
a precursor ion scan. 
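
In data-processing terms, such a scan amounts to keeping only those precursor ions whose CID spectra contain the tag fragment. The following is a minimal sketch of that filter; the data layout (a list of precursor m/z values paired with fragment peak lists), the function name, the tolerance and the example values are all assumptions introduced for illustration.

```python
def precursors_with_tag(scans, tag_fragment_mz, mz_tolerance=0.02, min_intensity=0.0):
    """Return precursor m/z values whose fragment spectrum contains the tag fragment.

    scans: iterable of (precursor_mz, [(fragment_mz, intensity), ...]).
    """
    hits = []
    for precursor_mz, fragments in scans:
        if any(
            abs(fragment_mz - tag_fragment_mz) <= mz_tolerance
            and intensity > min_intensity
            for fragment_mz, intensity in fragments
        ):
            hits.append(precursor_mz)
    return hits


if __name__ == "__main__":
    # Hypothetical scans: only the first and third contain a peak near 126.10.
    example_scans = [
        (601.30, [(126.10, 850.0), (175.12, 300.0)]),
        (455.20, [(149.02, 1200.0)]),
        (720.85, [(126.11, 400.0), (262.15, 90.0)]),
    ]
    print(precursors_with_tag(example_scans, tag_fragment_mz=126.10))
```

Precursors that lack the tag fragment, such as buffer or detergent contaminants, are excluded before any detailed sequence analysis is attempted.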

Having identified and selected a peptide ion, it is subjected to CID. The CID spectra are 
often quite complex and determining which peaks in the CID spectrum correspond to 
meaningful peptide fragment series is a further problem in determining the sequence of a 
peptide by mass spectrometry. Shevchenko et al., Rapid Commun. Mass Spec. 11: 1015- 
1024 (1997) describe a further method, which involves treating proteins for analysis with 
trypsin in 1:1 16O/18O water. The hydrolysis reaction results in two populations of 
peptides, the first whose terminal carboxyl contains 16O and the second whose terminal 
carboxyl contains 18O. Thus for each peptide in the sample there should be a double peak 
of equal intensity, where the components of the double peak are 2 Daltons apart. This is 
complicated slightly by intrinsic peptide isotope peaks but allows for automated scanning 
of the CID spectrum for doublets. The differences in mass between doublets can be 
determined to identify the amino acid by which the two fragments differ. This method may be 
applicable with the methods of this invention if N-terminal peptides are isolated. 
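
Automated scanning for such doublets can be sketched as a search for pairs of peaks separated by the 18O/16O mass difference with similar intensities. The following is a minimal illustration only and is not the procedure of the cited authors; the function name, tolerances and example peak list are assumptions, and no correction is made for overlapping natural isotope peaks.

```python
O18_O16_MASS_DIFFERENCE = 2.004246  # Da


def find_doublets(peaks, charge=1, mz_tolerance=0.01, ratio_tolerance=0.3):
    """Return (light_mz, heavy_mz) pairs that look like 16O/18O doublets.

    peaks: list of (mz, intensity). A pair qualifies when its spacing matches
    the 18O/16O difference divided by the charge and the two intensities are
    equal to within ratio_tolerance.
    """
    expected = O18_O16_MASS_DIFFERENCE / charge
    ordered = sorted(peaks)
    doublets = []
    for i, (light_mz, light_intensity) in enumerate(ordered):
        if light_intensity <= 0:
            continue
        for heavy_mz, heavy_intensity in ordered[i + 1:]:
            delta = heavy_mz - light_mz
            if delta > expected + mz_tolerance:
                break  # peaks are sorted, so no later peak can match
            if (abs(delta - expected) <= mz_tolerance
                    and abs(heavy_intensity / light_intensity - 1.0) <= ratio_tolerance):
                doublets.append((light_mz, heavy_mz))
    return doublets


if __name__ == "__main__":
    # Hypothetical fragment peaks: one doublet near 175.1/177.1, two unrelated peaks.
    example_peaks = [(175.119, 100.0), (177.123, 95.0), (300.0, 50.0), (301.0, 50.0)]
    print(find_doublets(example_peaks))
```

Only fragments that retain the labelled C-terminal carboxyl group appear as doublets, so the pairs returned by such a search single out one fragment series from the rest of the spectrum.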

Protein Expression Profiling 

To understand the changes in a cancerous tissue, for example, requires an understanding 
of all of the molecular changes in that tissue, ideally relating these changes to normal 
tissue. To determine all of the molecular changes requires the ability to measure changes 
in gene expression, protein expression and ultimately metabolite changes. It is possible to 
compare the expression, between different tissue samples, of large numbers of genes 
simultaneously at the level of messenger RNA (mRNA) using microarray technology (see 
for exarqple Iyer V.R. et al., Science 283(5398): 83-87, "The transcriptional program in 
the response of human fibroblasts to serum." 1999), however mRNA levels do not 
correlate directly to the levels of protein in a tissue. To determine a protein expression 
profile for a tissue, 2-dimensional gel electrophoresis is widely used. Unfortunately, this 
technique is extremely laborious and it is difficult to compare two or more samples 
simultaneously on a 2-D gel due to the difficulty of achieving reproducibility. As 
discussed above, peptides may be analysed effectively using the methods of this invention. 
The tags of this invention allow the same peptide from different samples to be identified 
using LC-MS-MS. In addition, the relative quantities of the same peptide in different 
samples may be determined. The ability to rapidly and sensitively determine the identity 
and relative quantities of peptides in a number of samples allows for expression profiling. 
Therefore it is an object of this invention to provide improved methods for comparative 
analysis of complex protein samples based on the selective isolation and labelling of 
peptides. Two published approaches for the global analysis of protein expression are 
discussed and various methods for the analysis of particular protein states, such as 
phosphorylation and carbohydrate modification are also described below. 

Terminal peptide isolation for global protein expression profiling 
Isolation of N- or C-terminal peptides has been described as a method to determine a 
global expression profile of a protein sample. Isolation of terminal peptides ensures that 
one and only one peptide per protein is isolated, so that the sample that is analysed 
has no more components than the original sample. 
Reducing large polypeptides to shorter peptides makes the sample more amenable to 
analysis by mass spectrometry. Methods for isolating peptides from the termini of 
polypeptides are discussed in PCT/GB98/00201 and PCT/GB99/03258. 

Isolation of peptides containing cysteine 

As discussed earlier, Gygi et al. (Nature Biotechnology 17: 994 - 999 1999) disclose the 
use of 'isotope encoded affinity tags' for the capture of peptides from proteins, to allow 
protein expression analysis. The authors report that a large proportion of proteins (>90%) 
in yeast have at least one cysteine residue (on average there are ~5 cysteine residues per 
protein). Reduction of disulphide bonds in a protein sample and capping of free thiols 
with iodoacetamidylbiotin results in the labelling of all cysteine residues. The labelled 
proteins are then digested, with trypsin for example, and the cysteine-labelled peptides 
may be isolated using avidinated beads. These captured peptides can then be analysed by 
liquid chromatography tandem mass spectrometry (LC-MS/MS) to determine an 
expression profile for the protein sample. Two protein samples can be compared by 
labelling the cysteine residues with a different isotopically modified biotin tag. This 
approach is slightly more redundant than an approach based on isolating terminal peptides 
as, on average, more than one peptide per protein is isolated so there are more peptide 
species in the sample than protein species. This increase in complexity is made worse by 
the nature of the tags used by Gygi et al. 

As discussed above the affinity tags described by Gygi et al have some disadvantages. 
Labelling each sample with a different isotope variant of the affinity tag results in an 
additional peak in the mass spectrum for each peptide in each sample. This means that if 
two samples are analysed together there will be twice as many peaks in the spectrum. 
Similarly, if three samples are analysed together, the spectrum will be three times more 
complex than for one sample alone. A further limitation, which is reported by the authors 
of the above paper, is the mobility change caused by the tags. The authors report that 
peptides labelled with a deuterated biotin tag elute slightly after the same peptide labelled 
with an undeuterated tag. This means that comparative analysis of multiple samples will 
be very difficult using the methods of Gygi et al. because of the complexity of the mass 
spectra and the complexity of the chromatographic steps if more than 2 samples were 
analysed. 

An improved method for analysing protein samples by labelling cysteine residues is 
envisaged using tags of the form shown in Figure 8. This Figure illustrates a pair of 
improved affinity tags derived from methionine. Different isotopically substituted forms 
of methionine would be used to prepare the two different tags. The total mass of each of 
the two tags is the same but the N-terminal methionine in each tag differs from the other 
tag by three Daltons. The alpha amino group of the dipeptide tag has been guanidinated 
to differentiate the fragmentation product of this amino acid from the fragmentation 
product of the second methionine residue and the natural methionine residues in protein 
and to promote protonation at this position in the tag during ionisation in a mass 
spectrometer. In addition these tags comprise a thiol reactive maleimide functionality. 

In an embodiment of the second aspect of this invention, a protocol for the analysis of a 
protein sample containing polypeptides with cysteine residues comprises the steps of: 

1. Reducing and reacting all cysteine residues in at least one protein sample with a 
maleimide affinity ligand mass tag; 

2. Cleaving the polypeptides with a sequence specific endoprotease; 

3. Capturing tagged peptides onto an avidin derivatised solid support; and 

4. Analysing the captured tagged peptides by LC-MS-MS. 

The protein samples may be digested with the sequence specific endoprotease before or 
after reaction of the sample with the affinity ligand mass tag. 
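
To illustrate which peptides such a protocol would capture, the sketch below digests a sequence in silico and keeps only the cysteine-containing peptides. It assumes the commonly used trypsin rule (cleavage C-terminal to K or R, but not before P) and complete digestion; the function names and the test sequence are hypothetical.

```python
import re
from typing import List


def tryptic_digest(sequence: str) -> List[str]:
    """Split after K or R unless the next residue is P (no missed cleavages)."""
    # Zero-width split points require Python 3.7 or later.
    return [p for p in re.split(r"(?<=[KR])(?!P)", sequence) if p]


def cysteine_peptides(sequence: str) -> List[str]:
    """Peptides that would carry a thiol-reactive tag after digestion."""
    return [p for p in tryptic_digest(sequence) if "C" in p]


# Hypothetical sequence containing two cysteine residues.
print(tryptic_digest("MACDKRPLCKGGR"))
print(cysteine_peptides("MACDKRPLCKGGR"))
```

Because a protein may yield several cysteine-containing peptides, the captured set can be larger than the number of proteins, which is the redundancy noted above.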

Isolation of carbohydrate modified proteins 

Carbohydrates are often present as a post-translational modification of proteins. Various 
affinity chromatography techniques for the isolation of these sorts of proteins are known 
(For a review see Gerard C., Methods Enzymol. 182: 529-539, "Purification of 
glycoproteins." 1990). A variety of natural protein receptors for carbohydrates are 
known. The members of this class of receptors, known as lectins, are highly selective for 
particular carbohydrate functionalities. Affinity columns derivatised with specific lectins 
can be used to isolate proteins with particular carbohydrate modifications, whilst affinity 
columns comprising a variety of different lectins could be used to isolate populations of 
proteins with a variety of different carbohydrate modifications. In one embodiment of the 
second aspect of this invention, a protocol for the analysis of a sample of proteins, which 
contains carbohydrate modified proteins, comprises the steps of: 

1. Treating the sample with a sequence specific cleavage reagent such as Trypsin or 
Lys-C; 

2. Passing the protein sample through affinity columns containing lectins or boronic acid 
derivatives to isolate only carbohydrate modified peptides; 

3. Labelling the captured sugar modified peptides at the free alpha amino group 
generated by the sequence specific cleavage, using the peptide mass tags of this 
invention; and 

4. Analysing the tagged peptides by LC-MS-MS. 

An N-hydroxysuccinimide activated tag could be used to label the free alpha-amino 
groups. If Lys-C is used then each carbohydrate modified peptide will have a free 
epsilon-amino group as well as a free alpha-amino group, both of which can be tagged. 

Many carbohydrates have vicinal-diol groups present, i.e. hydroxyl groups present on 
adjacent carbon atoms. Carbohydrates that contain vicinal diols in a 1,2-cis-diol 
configuration will react with boronic acid derivatives to form cyclic esters. This 
reaction is favoured at basic pH but is easily reversed at acid pH. Resin immobilised 
derivatives of phenyl boronic acid have been used as ligands for affinity capture of 
proteins with cis-diol containing carbohydrates. In one embodiment of the fourth aspect 
of this invention a set of affinity ligand peptide mass tags comprising biotin linked to a 
phenylboronic acid entity could be synthesised, as shown in Figure 6b. These boronic 
acid tags could be used to label two separate samples comprising peptides or proteins with 
carbohydrate modifications that contain vicinal cis-diols. In another embodiment of the 
second aspect of this invention, a protocol for the analysis of a protein sample containing 
carbohydrate modified polypeptides comprises the steps of: 

1. Reacting at least one protein sample at basic pH with a boronic acid affinity ligand 
mass tag; 

2. Cleaving the polypeptides with a sequence specific endoprotease; 

3. Capturing tagged peptides onto an avidin derivatised solid support; and 

4. Analysing the captured tagged peptides by LC-MS-MS. 

The sample may be digested with the sequence specific endoprotease before or after 
reaction of the sample with the affinity ligand mass tag. 

Vicinal-diols, in sialic acids for example, can also be converted into carbonyl groups by 
oxidative cleavage with periodate. Enzymatic oxidation of sugars containing terminal 
galactose or galactosamine with galactose oxidase can also convert hydroxyl groups in 
these sugars to carbonyl groups. Complex carbohydrates can also be treated with 
carbohydrate cleavage enzymes, such as neuraminidase, which selectively remove specific 
sugar modifications leaving behind sugars, which can be oxidised. These carbonyl groups 
can be tagged allowing proteins bearing such modifications to be detected or isolated. 
Hydrazide reagents, such as Biocytin hydrazide (Pierce & Warriner Ltd, Chester, UK) 
will react with carbonyl groups in carbonyl-containing carbohydrate species (E.A. Bayer 
et al., Anal. Biochem. 170: 271-281, "Biocytin hydrazide - a selective label for sialic 
acids, galactose, and other sugars in glycoconjugates using avidin biotin technology", 
1988). Alternatively a carbonyl group can be tagged with an amine modified biotin, such 
as Biocytin and EZ-Link™ PEO-Biotin (Pierce & Warriner Ltd, Chester, UK), using 
reductive alkylation (Means G.E., Methods Enzymol 47: 469-478, "Reductive alkylation 
of amino groups." 1977; Rayment I., Methods Enzymol 276: 171-179, "Reductive 
alkylation of lysine residues to alter crystallization properties of proteins." 1997). 
Proteins bearing vicinal-diol containing carbohydrate modifications in a complex mixture 
can thus be biotinylated. Biotinylated, hence carbohydrate modified, proteins may then 
be isolated using an avidinated solid support. 

A set of peptide mass tags according to this invention can be synthesised for the analysis 
of carbohydrate modified peptides that have been oxidised with periodate, as shown in 
Figure 6a. Figure 6a shows a set of two tags derived from methionine. Different 
isotopically substituted forms of methionine would be used to prepare the two different 
tags. The total mass of each of the two tags is the same but the N-terminal methionine in 
each tag differs from the other tag by three Daltons. The alpha amino group of the 
dipeptide tag has been guanidinated to differentiate the fragmentation product of this 
amino acid from the fragmentation product of the second methionine residue and to 
promote protonation at this position in the tag during ionisation in a mass spectrometer. 
A further embodiment of the second aspect of this invention comprises the steps of: 

1. Treating a sample of polypeptides with periodate, so that carbohydrates with 
vicinal cis-diols on glycopeptides will gain a carbonyl functionality; 

2. Labelling this carbonyl functionality with a hydrazide activated peptide mass tag 
linked to biotin, as shown in Figure 6a; 

3. Digesting the protein sample with a sequence specific endoprotease; 

4. Capturing tagged peptides onto an avidin derivatised solid support; and 

5. Analysing the biotinylated peptides by LC-MS-MS. 

The protein sample may be digested with the sequence specific endoprotease before or 
after reaction of the sample with the affinity ligand mass tag. 

Isolation of Phosphopeptides 

Phosphorylation is a ubiquitous reversible post-translational modification that appears in 
the majority of signalling pathways of almost all organisms, where it is widely 
used as a transient signal to mediate changes in the state of individual proteins. It is an 
important area of research, and tools which allow the analysis of the dynamics of 
phosphorylation are essential to a full understanding of how cells respond to stimuli, 
including their responses to drugs. 

Techniques for the analysis of phosphoserine and phosphothreonine containing peptides 
are well known. One class of such methods is based on a well known reaction for beta- 
elimination of phosphates. This reaction results in phosphoserine and phosphothreonine 
forming dehydroalanine and methyldehydroalanine, both of which are Michael acceptors 
and will react with thiols. This has been used to introduce hydrophobic groups for 
affinity chromatography (see for example Holmes C.F., FEBS Lett 215(1): 21-24, "A 
new method for the selective isolation of phosphoserine-containing peptides." 1987). 
Dithiol linkers have also been used to introduce fluorescein and biotin into phosphoserine 
and phosphothreonine containing peptides (Fadden P, Haystead TA, Anal Biochem 
225(1): 81-8, "Quantitative and selective fluorophore labelling of phosphoserine on 
peptides and proteins: characterization at the attomole level by capillary electrophoresis 
and laser-induced fluorescence." 1995; Yoshida O. et al., Nature Biotech 19: 379-382, 
"Enrichment analysis of phosphorylated proteins as a tool for probing the 
phosphoproteome" 2001). The method of Yoshida et al. for affinity enrichment of 
proteins phosphorylated at serine and threonine could be improved by using the 
maleimide tag shown in Figure 8 to allow the comparison of multiple samples. This 
would be particularly useful for the analysis of the dynamics of phosphorylation cascades. 

A tag peptide of the form shown in figure 7 would allow direct labelling of beta- 
eliminated phosphothreonine and phosphoserine residues without a dithiol linker. The tag 
tetrapeptide of figure 7 is derived from methionine. Different isotopically substituted 
forms of methionine would be used to prepare the two different tags. The total mass of 
each of the two tags is the same but the N-terminal methionine in each tag differs from the 
other tag by three Daltons. The alpha amino group of the dipeptide tag has been 
guanidinated to differentiate the fragmentation product of this amino acid from the 
fragmentation product of the second methionine residue and natural methionine residues 
in proteins and to promote protonation at this position in the tag during ionisation in a 
mass spectrometer. The tag peptide is guanidinated at the N-Terminus to provide 
enhanced sensitivity and to distinguish the N-terminal residue from the C-terminal 
residue. The cysteine residue provides a free thiol, which can nucleophilically attack 
dehydroalanine and methyldehydroalanine. An improved protocol for the beta-
elimination based labelling procedure is known. This improved procedure involves 
barium catalysis (Byford M.F., Biochem J. 280: 261-265, "Rapid and selective 
modification of phosphoserine residues catalysed by Ba2+ ions for their detection during 
peptide microsequencing." 1991). This catalysis makes the reaction 20-fold faster, reducing 
side-reactions to undetectable levels. The tag peptide shown in Figure 7 could be easily 
coupled to dehydroalanine or methyldehydroalanine generated from beta-elimination of 
phosphates using barium catalysis. Thus in a further embodiment of the second aspect of 
this invention, peptides phosphorylated at serine and threonine may be analysed in a 
method comprising the steps of: 

1. Treating a sample of polypeptides with barium hydroxide to beta-eliminate 
phosphate groups from phosphoserine and phosphothreonine; 

2. Labelling the resultant dehydroalanine or methyldehydroalanine functionalities 
with the thiol activated peptide mass tag linked to biotin, as shown in Figure 7; 

3. Digesting the protein sample with a sequence specific endoprotease; 

4. Capturing tagged peptides onto an avidin derivatised solid support; and 

5. Analysing the biotinylated peptides by LC-MS-MS. 

The protein sample may be digested with the sequence specific endoprotease before or 
after reaction of the sample with the affinity ligand mass tag. 
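
The mass bookkeeping behind the beta-elimination and thiol labelling steps can be sketched as below. This is a hedged illustration: the residue masses are standard monoisotopic values, the function names are hypothetical, and the mass of the thiol tag is left as a parameter because it is not stated in the text.

```python
# Standard monoisotopic masses in Da.
SER = 87.03203     # serine residue
THR = 101.04768    # threonine residue
HPO3 = 79.96633    # mass added to a residue on phosphorylation
H2O = 18.01056


def beta_eliminated_mass(residue: float) -> float:
    """Residue mass after beta-elimination of phosphate from the phospho-residue.

    The phospho-residue is residue + HPO3; elimination removes H3PO4
    (HPO3 + H2O), leaving dehydroalanine (from Ser) or
    methyldehydroalanine (from Thr).
    """
    return (residue + HPO3) - (HPO3 + H2O)


def tagged_mass(residue: float, thiol_tag_mass: float) -> float:
    """Michael addition adds the whole thiol tag (R-SH) across the double bond."""
    return beta_eliminated_mass(residue) + thiol_tag_mass


print(round(beta_eliminated_mass(SER), 4))  # dehydroalanine residue
print(round(beta_eliminated_mass(THR), 4))  # methyldehydroalanine residue
```

The net effect of the two steps on a phosphopeptide is therefore a loss of H3PO4 followed by a gain of the full mass of the tag.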

A number of research groups have reported on the production of antibodies, which bind to 
phosphotyrosine residues in a wide variety of proteins (see for example A.R. Frackelton 
et al., Methods Enzymol 201: 79-92, "Generation of monoclonal antibodies against 
phosphotyrosine and their use for affinity purification of phosphotyrosine-containing 
proteins.", 1991 and other articles in this issue of Methods Enzymol.). This means that a 
significant proportion of proteins that have been post-translationally modified by tyrosine 
phosphorylation may be isolated by affinity chromatography using these antibodies as the 
affinity column ligand. 

These phosphotyrosine binding antibodies can be used in the context of this invention to 
isolate peptides from proteins containing phosphotyrosine residues. The tyrosine-
phosphorylated proteins in a complex mixture may be isolated using anti-phosphotyrosine 
antibody affinity columns. In a further embodiment of the second aspect of this invention, 
a protocol for the analysis of a sample of proteins, which contains proteins 
phosphorylated at tyrosine, comprises the steps of: 

1. Treating the sample with a sequence specific cleavage reagent such as Trypsin or 
Lys-C; 

2. Passing the protein sample through affinity columns containing anti-phosphotyrosine 
antibodies to isolate only phosphotyrosine modified peptides; 

3. Labelling the captured phosphopeptides at the free alpha amino group generated 
by the sequence specific cleavage, using the peptide mass tags of this invention; and 

4. Analysing the tagged peptides by LC-MS-MS. 

An N-hydroxysuccinimide activated tag could be used to label the free alpha-amino 
groups. 

Immobilised Metal Affinity Chromatography (IMAC) represents a further technique for 
the isolation of phosphoproteins and phosphopeptides. Phosphates adhere to resins 
comprising trivalent metal ions, particularly Gallium(III) ions (Posewitz, M.C. and 
Tempst, P., Anal. Chem., 71: 2883-2892, "Immobilized Gallium(III) Affinity 
Chromatography of Phosphopeptides", 1999). This technique is advantageous as it can 
isolate both serine/threonine phosphorylated and tyrosine phosphorylated peptides and 
proteins simultaneously. 

IMAC can therefore also be used in the context of this invention for the analysis of 
samples of phosphorylated proteins. In a further embodiment of the second aspect of this 
invention, a protocol for the analysis of a sample of proteins, which contains 
phosphorylated proteins, comprises the steps of: 

1. Treating the sample with a sequence specific cleavage reagent such as Trypsin or 
Lys-C; 

2. Passing the protein sample through an affinity column comprising immobilised 
metal ions to isolate only phosphorylated peptides; 

3. Labelling the captured phosphopeptides at the free alpha amino group generated 
by the sequence specific cleavage, using the peptide mass tags of this invention; 
and 

4. Analysing the tagged peptides by LC-MS-MS. 

An N-hydroxysuccinimide activated tag could be used to label the free alpha-amino 
groups. 

In an alternative embodiment of the second aspect of this invention, a sample of 
phosphorylated proteins may be analysed by isolating phosphorylated proteins followed 
by analysis of the N or C terminal peptides of the phosphoproteins. Techniques for the 
isolation of terminal peptides are disclosed in a number of patent applications, e.g. 
WO 98/32876, WO 00/20870 and EP 01304975.4. A protocol for the analysis of a sample 
of proteins, which contains phosphorylated proteins, would comprise the steps of: 

1. Passing the protein sample through an affinity column comprising immobilised 
metal ions to isolate only phosphorylated proteins; 

2. Isolating C and/or N terminal peptides from the captured phosphorylated proteins; 

3. Labelling the captured terminal peptides, using the peptide mass tags of this 
invention; and 

4. Analysing the tagged peptides by LC-MS-MS. 

Examples 

Example 1 - Syntheses of X-Metd3-Met-Gly-OH (A) and of X-Met-Metd3-Gly-OH (B) 

A pair of peptides were synthesized using conventional automated synthesis techniques to 
illustrate the features of this invention (both starting from commercially available Fmoc- 
Gly-Trt-PS resin from Rapp Polymere, Germany). The two peptides A and B are shown 
in Figure 10 and will be referred to as the two Met-Met-Gly (D3) peptides. 

Deuterated methionine (Metd3) is available from ISOTEC Inc, Miamisburg, Ohio, USA. 
The Fmoc reagent for use in a peptide synthesiser must, however, be synthesised 
manually from the unprotected deuterated methionine. 

Synthesis of N-(9-Fluorenylmethoxycarbonyl)-L-methionine-methyl-d3 (Fmoc-Metd3) 

The synthesis of Fmoc-Metd3 (shown in Figure 9a) was carried out in two steps. 

1. Synthesis of 9-Fluorenylmethyl-pentafluorophenyl carbonate 

8.4mL (60mmol) triethylamine were added at 0°C to a mixture of 11g (60mmol) 
pentafluorophenol and 15.5g (60mmol) chloroformic acid (9-fluorenylmethyl) ester in 
100mL dry ether. After 2 hours reaction, 20 mL cold water was added to the solution. 
The organic layer was washed twice with water and dried. After evaporation of the solvent, 
the obtained product was crystallized from heptane. Yield: 16.4g (67%). 

2. Synthesis of N-(9-Fluorenylmethoxycarbonyl)-L-methionine-methyl-d3 

2.2g (14.5mmol) L-methionine-methyl-d3 (Metd3) was suspended in 50mL acetone. 2.5g 
(29mmol) sodium hydrogencarbonate and 60 mL water and then 5.7g (14mmol) 9-
fluorenylmethyl-pentafluorophenyl carbonate were added to the stirred suspension. After 
48 hours reaction, the pH of the clear solution was adjusted to pH 3 and the organic layer 
was extracted with ethylacetate. After drying the extracted organic layer, the ethylacetate 
was evaporated and the product was precipitated by addition of heptane. That procedure 
(dilution with ethylacetate and precipitation by hexane) was repeated twice before 
obtaining the pure product, Fmoc-Metd3. (Yield: 5.0g (92%); m.p.: 126-128°C; [α]D20: 
-30°, c=1, DMF) 
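
The reported yields of the two steps can be cross-checked with a short calculation. This is a sketch under stated assumptions: the molecular formulas below are not given in the text and are assumed here (C21H11F5O3 for the carbonate, C20H18D3NO4S for Fmoc-Metd3), average atomic masses are used, and the 92% figure is taken to be relative to the 14.5 mmol of deuterated methionine.

```python
# Average atomic masses in g/mol ("D" is deuterium).
ATOMIC_MASS = {"C": 12.011, "H": 1.008, "D": 2.014, "N": 14.007,
               "O": 15.999, "F": 18.998, "S": 32.06}


def molar_mass(formula: dict) -> float:
    """Molar mass in g/mol from an element-count dictionary."""
    return sum(ATOMIC_MASS[el] * n for el, n in formula.items())


def percent_yield(mass_g: float, formula: dict, limiting_mmol: float) -> float:
    """Isolated amount as a percentage of the limiting reagent."""
    product_mmol = 1000.0 * mass_g / molar_mass(formula)
    return 100.0 * product_mmol / limiting_mmol


# Assumed formulas (not stated in the text).
FMOC_PFP_CARBONATE = {"C": 21, "H": 11, "F": 5, "O": 3}
FMOC_MET_D3 = {"C": 20, "H": 18, "D": 3, "N": 1, "O": 4, "S": 1}

print(round(percent_yield(16.4, FMOC_PFP_CARBONATE, 60.0)))  # step 1
print(round(percent_yield(5.0, FMOC_MET_D3, 14.5)))          # step 2
```

Under these assumptions the calculation gives 67% and 92%, in agreement with the yields quoted for the two steps.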

The reaction sequences of the peptide synthesiser for the preparation of the two peptides 
shown in Figure 10 are listed below. 

Peptide sequence (A) 

• Swelling of 50mg of Fmoc-Gly-Trt-PS resin for 5min in 2ml of 
dimethylformamide (DMF); 

• Removal of the Fmoc group with piperidine in DMF following standard protocols; 

• Dissolving of 49mg (0.32mmol) of 1-hydroxybenzotriazole (HOBt) in 800µl 
DMF; 

• Addition of 120mg (0.32mmol) Fmoc-Met to the HOBt solution; this solution was 
added to the resin and incubated for 3min; 

• 50µl (0.32mmol) of diisopropylcarbodiimide (DIC) was then added; coupling time 
(0.4M activated amino acid) 

• Removal of the Fmoc group with piperidine in DMF following standard protocols. 

• Dissolving of 49mg (0.32mmol) HOBt in 800µl DMF; 

• Adding to 120mg (0.32mmol) Fmoc-Metd3; this solution was added to the resin 
and incubated for 3min; 

• 50µl (0.32mmol) DIC was then added; coupling time (0.4M activated amino acid) 

• Removal of the Fmoc group with piperidine in DMF following standard protocols; 

• 150 mg (0.32mmol) "Boc2X-OSu" were dissolved in 800µl DMF and this solution 
was added to the resin; 

• 53µl of diisopropylethylamine (DIPEA) were then added to the resin, and the 
coupling was left to proceed for 3 hours (0.4M activated species); 

• After washing the resin the desired substance was cleaved from the resin with 1ml 
TFA containing 2.5% each of H2O, Et3SiH and thioanisole within 1h; 

• Adding of 30ml water to TFA solution after filtration, removing of all solvent by 
lyophilisation. 

A white powder of peptide sequence (A) resulted. 

Peptide sequence (B) 

• Swelling of 50mg resin 5min in 2ml DMF; 

• Removal of the Fmoc group with piperidine in DMF following standard protocols; 

• Dissolving of 49mg (0.32mmol) HOBt in 800µl DMF; 

• Adding to 120mg (0.32mmol) Fmoc-Metd3; this solution was added to the resin 
and incubated for 3min; 

• 50µl (0.32mmol) DIC was added; coupling time (0.4M activated amino acid) 

• Removal of the Fmoc group with piperidine in DMF following standard protocols; 

• Dissolving of 49mg (0.32mmol) HOBt in 800µl DMF; 

• Adding to 120mg (0.32mmol) Fmoc-Met; this solution was added to the resin and 
incubated for 3min; 

• 50µl (0.32mmol) DIC was added; coupling time (0.4M activated amino acid) 

• Removal of the Fmoc group with piperidine in DMF following standard protocols; 

• 150 mg (0.32mmol) "Boc2X-OSu" were dissolved in 800µl DMF and this solution 
was added to the resin; 

• 53µl DIPEA were added, 3h coupling time (0.4M activated species) 

• After washing the resin the desired substance was cleaved from the resin with 1ml 
TFA containing 2.5% each of H2O, Et3SiH and thioanisole within 1h; 

• Adding of 30ml water to TFA solution after filtration, and removal of all solvent 
by lyophilisation. 

A light yellow powder of peptide sequence (B) resulted. 

HPLC 

After cleavage, ca. 80% pure product was obtained for each peptide. The products were 
then purified by HPLC. 

MS 

The identity of the peptides A and B was confirmed by mass spectrometry. A mass-to- 
charge ratio of 496 was observed as the main peak in both MALDI and ESI mass spectra 
for both products, which fits the calculated mass of both peptides. A mass spectrum from 
the analysis of a mixture of peptides A and B by ESI mass spectrometry is shown in 
Figure 11. It can be seen that the two peptides have mass spectra that overlap almost 
exactly, which is as expected. 
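
The observed m/z of 496 can be cross-checked by a monoisotopic mass calculation. The sketch below assumes that X is the deprotected 6-guanidinohexanoyl group derived from the linker of Example 2 (the structure itself is shown only in Figure 10, so this is an assumption), and all names in the code are hypothetical.

```python
# Monoisotopic masses in Da ("D" is deuterium).
MONO = {"C": 12.0, "H": 1.00782503, "D": 2.01410178,
        "N": 14.0030740, "O": 15.9949146, "S": 31.9720707}
PROTON = 1.00727646


def mono_mass(formula: dict) -> float:
    """Monoisotopic mass from an element-count dictionary."""
    return sum(MONO[el] * n for el, n in formula.items())


WATER = mono_mass({"H": 2, "O": 1})

# Free-acid building blocks; the identity of X is an assumption.
GUANIDINOHEXANOIC_ACID = {"C": 7, "H": 15, "N": 3, "O": 2}
MET = {"C": 5, "H": 11, "N": 1, "O": 2, "S": 1}
MET_D3 = {"C": 5, "H": 8, "D": 3, "N": 1, "O": 2, "S": 1}
GLY = {"C": 2, "H": 5, "N": 1, "O": 2}


def protonated_mz(building_blocks) -> float:
    """[M+H]+ of a linear chain: free acids minus one water per amide bond."""
    total = sum(mono_mass(b) for b in building_blocks)
    return total - (len(building_blocks) - 1) * WATER + PROTON


peptide_a = [GUANIDINOHEXANOIC_ACID, MET_D3, MET, GLY]
peptide_b = [GUANIDINOHEXANOIC_ACID, MET, MET_D3, GLY]
print(round(protonated_mz(peptide_a), 3), round(protonated_mz(peptide_b), 3))
```

Both peptides have the same elemental and isotopic composition, so under this assumption they share one calculated [M+H]+ value that rounds to 496, consistent with the overlapping spectra described above.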

MS/MS 

Figure 13 shows the proposed fragmentation reaction mechanism for the products of 
collision induced dissociation of the model peptides A and B shown in figure 10. Figure 
12 shows a pair of ESI MS/MS spectra generated by an LCQ ion trap mass spectrometer 
from Finnigan MAT. The ESI MS/MS spectra show the fragmentation products of 
peptides A and B. The desired b2-fragment ion (see Figure 10) has a high intensity for 
both substances (273 after loss of ammonia for A and 270 after loss of ammonia for B). 
Figure 14 shows an ESI-MS/MS spectrum of the fragmentation products from the 
analysis of a mixture of peptides A and B. A and B were present in the mixture in a ratio 
of 70:30 respectively. This ratio can be seen in the intensities of the b2-fragment ion 
peaks at m/z 273 and 270 for peptides A and B respectively. This spectrum shows that 
the tags can reveal the ratio of their associated peptides when pairs of samples are 
compared. Figure 15 shows a linear regression curve for a series of ESI-MS/MS 
experiments with peptides A and B. The graph shows a plot of the ratio of A to B in the 
mixture against the observed intensities of the b2-fragment ions from ESI-MS/MS 
analysis of the mixtures. The graph shows that there is a good correspondence between 
the expected and observed ratios. 
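
The b2-fragment values and the ratio read-out can be illustrated with a short calculation. As before, this sketch assumes that X is a 6-guanidinohexanoyl group; the function names and the example intensities are hypothetical and are not measured data.

```python
# Monoisotopic masses in Da ("D" is deuterium).
MONO = {"C": 12.0, "H": 1.00782503, "D": 2.01410178,
        "N": 14.0030740, "O": 15.9949146, "S": 31.9720707}
PROTON = 1.00727646


def mono_mass(formula: dict) -> float:
    return sum(MONO[el] * n for el, n in formula.items())


WATER = mono_mass({"H": 2, "O": 1})
AMMONIA = mono_mass({"N": 1, "H": 3})

X_ACID = {"C": 7, "H": 15, "N": 3, "O": 2}   # assumed: 6-guanidinohexanoic acid
MET = {"C": 5, "H": 11, "N": 1, "O": 2, "S": 1}
MET_D3 = {"C": 5, "H": 8, "D": 3, "N": 1, "O": 2, "S": 1}


def b2_minus_ammonia(first_met: dict) -> float:
    """m/z of the X-Met b-type ion after loss of ammonia (singly charged)."""
    residues = mono_mass(X_ACID) + mono_mass(first_met) - 2 * WATER
    return residues + PROTON - AMMONIA


def fraction_of_a(intensity_a: float, intensity_b: float) -> float:
    """Share of peptide A estimated from the two tag-fragment intensities."""
    return intensity_a / (intensity_a + intensity_b)


print(round(b2_minus_ammonia(MET_D3)), round(b2_minus_ammonia(MET)))
print(fraction_of_a(7000.0, 3000.0))  # hypothetical intensities
```

With the deuterated methionine at the N-terminal position (peptide A) the calculated fragment rounds to m/z 273, and with the undeuterated methionine (peptide B) to m/z 270, matching the peaks discussed above.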

Example 2 - 6-[Bis(tert-butyl-oxycarbonyl)guanidino]-hexanoic acid-N-
hydroxysuccinimide ester 

The synthesis of the guanidino-active ester linker shown in Figure 9b was carried out in 3 
stages shown below. 

1. Synthesis of amino-iminomethane sulphonic acid 

50mL acetic anhydride and 2 drops of conc. sulphuric acid were added to 45g (397mmol) 
30% aqueous hydrogen peroxide under ice cooling. After 30 minutes, a further 100mL 
(1157mmol) acetic anhydride was added to the solution at 10-12°C. The 
reaction mixture was stirred overnight and reached room temperature in that time. 
After adding 150 mL methanol, a solution of 10g (131mmol) thiourea in 
500mL methanol was added dropwise to the reaction at 15-20°C. The reaction was 
stirred at RT for 48 hours. After filtration, the solution was concentrated to 60 mL. The 
obtained product was filtered and washed with ethanol and purified by crystallisation 
from acetic acid (ca. 1L). Yield: 6.0g (37%). 

2. Synthesis of 6-Guanidinohexanoic acid 

6.5g (50mmol) 6-aminohexanoic acid and 6.9g (50mmol) sodium carbonate were 
dissolved in 50mL water. 6.2g (50mmol) amino-iminomethane sulphonic acid was added 
under stirring to the solution. After 20 hours, the product was filtered and washed with 
acetic acid, methanol and then ether. Yield: 6.6g (76%). 

3. Synthesis of 6-[Bis(tert-butyl-oxycarbonyl)guanidino]-hexanoic acid-N-hydroxy 
succinimide ester 

9.5g (55mmol) 6-Guanidinohexanoic acid and 55g (270mmol) N,O-
Bistrimethylsilylacetamide were stirred in 100mL dichloromethane and heated under 
reflux until a clear solution was obtained (the reaction was left for approximately 10 
hours). 46g (210mmol) Di-tert-butyl pyrocarbonate was added to the solution at RT and 
the reaction mixture was stirred at RT for 18 hours (overnight) and then heated under 
reflux for 3 hours. The solution was then cooled to RT and washed with a 10% 
citric acid solution and a sodium chloride solution. After evaporation of the solvent, the 
pyrocarbonate was distilled off at 80-90°C under vacuum. The viscous liquid obtained (30g) 
was dissolved in 100ml dichloromethane with 8.6g (75mmol) N-hydroxysuccinimide. 
15.5g (75mmol) dicyclohexylcarbodiimide (DCC) was added in portions to the reaction 
mixture with stirring at RT. After 17 hours, the urea was removed by filtration. The 
solution was washed with a 10% citric acid solution and after removing the solvent, the 
product was purified by chromatography (silica gel, solvent: 
dichloromethane/ethylacetate). The product was then crystallized from diisopropylether. 
Yield: 6.0g (19%). Rf: 0.77 (dichloromethane/ethylacetate: 3/1). m.p.: 108-109°C. 

Example 3 

Experimental protocols 

Two pairs of TMT reagents are shown in Figures 18a and 18b. The reagents are peptide 
tags according to this invention comprising one 'tag' amino acid linked to a sensitisation 
group ([1], [2], [3]), which is a guanidino-functionality, one 'mass normalisation' amino 
acid and in the second pair of tags, a cleavage enhancement group, which is proline in this 
case ([4]). These tags are designed so that on analysis by collision-induced dissociation 
(CID), the tag fragment is released to give rise to an ion with a specific mass-to-charge 
ratio. The current accepted model of peptide scission during CID requires protonation of 
the peptide backbone followed by nucleophilic attack of the carbonyl moiety of the 
protonated amide by the next N-terminal carbonyl residue in the peptide chain to form a 
relatively stable oxazolone leading to scission of the amide bond ([5]). The sensitization 
enhancer is linked to the N-terminal methionine residue by an amide bond but cleavage 
does not take place at this amide as there is no amide correctly positioned to allow 
cyclisation and cleavage at this position so cleavage can only take place between the two 
methionine residues. This means that the N-terminal methionine is distinguished from the 
second methionine by the mass of the guanidino sensitisation group. Thus each pair of 
tags allows a pair of peptides to be distinguished by MS/MS analysis. Each tag can also 
bear a reactive functionality. In the figure, the reactive functionality, R, is not specified 
but could be an N-hydroxysuccinimide ester, which allows for the specific labelling of 
amino-groups. Clearly this reactive functionality can be easily varied to allow different 
biological nucleophiles to be labelled. In addition, the tag design can be readily modified 
to accommodate an affinity ligand such as biotin. Furthermore, it should be clear that 
more than two tags can be generated allowing for comparison of additional samples or for 
the introduction of labelled standards. 

In the following examples, peptides, listed in Table 7, have been synthesised as if they 
have been completely labelled on the alpha amino group with the above tags, i.e. the tag 
was 'pre-incorporated' during the synthesis to test the performance of the tags 
independently of the labelling reactions, so that in the following examples the 'R' group 
shown in Figures 18a and 18b is the peptide sequence to which the tag is attached. The 
tagged peptides were analysed by ESI-MS/MS and LC-ESI-MS/MS. 

Figures 18a and 18b show the structures of two versions of the TMT markers. The tags 
are modular, comprising different functional components that correspond to individual 
synthetic components in the automated synthesis of these reagents. Each tag comprises a 
sensitisation group and a mass differentiated group that together comprise the 'tag 
fragment' that is actually detected. The tag fragment is linked to a mass normalisation 
group that ensures that each tag in a pair of tags shares the same overall mass and atomic 
composition. The first and second generation tags are distinguished by the presence of an 
additional fragmentation enhancing group, proline, in the second generation tag. The tags 
will additionally comprise a reactive functionality (R) to enable the tag to be coupled to 
any peptide, but in the present experiments R is one of a number of peptide sequences. 
The proposed tag fragment that results from the markers is shown in Figure 18c, based on 
current theories on backbone protonation dependent mechanisms of fragmentation ([5]). 

Syntheses of TMT labelled peptides 

The peptides shown in Table 7 were synthesised using conventional automated Fmoc 
synthesis techniques (both starting from commercially available Fmoc-Gly-Trt-PS resin 
from Rapp Polymere, Germany). Deuterated methionine (Met-d3) is available from 
ISOTEC Inc., Miamisburg, Ohio, USA. An Fmoc-Met-d3 reagent for use in a peptide 
synthesiser was synthesised manually from the unprotected deuterated methionine as 
described above. The guanidino 'sensitisation' enhancement group was synthesised as an 
N-hydroxysuccinimide ester (NHS-ester) as described above and added to deprotected 
alpha-amino groups of synthetic peptides by conventional methods during automated 
peptide synthesis. 

Table 7 

     Peptide sequences                     Generation 1             Generation 2
                                           Mi       Ion at m/z (z)  Mi       Ion at m/z (z)
1A   TMT-GVATVSLPR                         1319.7   660.9 (2+)      1415.7   708.9 (2+)
1B   TMT-GVATVSLPR                         1319.7   660.9 (2+)      1415.7   708.9 (2+)
2A   TMT-GLGEHNIDVLEGNEQFINAAK             2688.31  897.1 (3+)      2784.3   928.8 (3+)
2B   TMT-GLGEHNIDVLEGNEQFINAAK             2688.31  897.1 (3+)      2784.3   928.8 (3+)
3A   TMT-GNKPGVYTK                         1383.7   462.2 (3+)      1479.7   494.3 (3+)
3B   TMT-GNKPGVYTK                         1383.7   462.2 (3+)      1479.7   494.3 (3+)
4A   TMT-GDPAALKRARNTEAARRSRARKLQRMKQGGC   3874.6   969.7 (4+)      3970.6   993.7 (4+)
4B   TMT-GDPAALKRARNTEAARRSRARKLQRMKQGGC   3874.6   969.7 (4+)      3970.6   993.7 (4+)


Table 7: Abundance ratio experiments were performed with the peptides listed above. 
HPLC experiments were performed with the first three peptide sequences listed above. 
Pairs of synthetic peptides were prepared with either the first or second TMT 
pre-incorporated into the peptide sequence at the N-terminus. Sequences, mono-isotopic 
molecular mass and mass-to-charge ratios of predominant ion species are listed for each 
peptide. 
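
The relationship between the masses and the ion mass-to-charge ratios listed in Table 7 can be illustrated with a short calculation. The following is a minimal sketch, assuming that each listed ion is the multiply protonated species [M+zH]z+; the function name and the rounded proton mass are illustrative choices and are not taken from the experiments.

```python
PROTON_MASS = 1.00728  # mass of a proton in Da (rounded standard value)


def mz_of_protonated_ion(neutral_mass: float, charge: int) -> float:
    """Return the m/z of the [M+zH]z+ ion for a neutral molecular mass M."""
    if charge < 1:
        raise ValueError("charge must be a positive integer")
    return (neutral_mass + charge * PROTON_MASS) / charge


# Masses taken from Table 7 (first generation tag, peptides 1 and 4).
for mass, z in ((1319.7, 2), (3874.6, 4)):
    print(f"M = {mass}, z = {z}+: m/z = {mz_of_protonated_ion(mass, z):.2f}")
```

For these two entries the calculated values agree with the tabulated 660.9 (2+) and 969.7 (4+) to within rounding.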



MS/MS analysis of TMT-labelled peptides 

Analyses were performed by liquid chromatography mass spectrometry using either a 
Finnigan LCQ Deca with a Finnigan Surveyor HPLC System (Column: 50 x 2.1 mm, 
5 µm HyPURITY™ Elite C18) or a QTOF 2 from Micromass Ltd, Manchester, UK with a 
Cap-LC HPLC system from LEAP Technologies (Column: a PepMap C18 HPLC column 
from Dionex with a 75 µm inner diameter was used; the resin had a 3 µm particle size 
and 100 Å pore size). 

Ion abundance ratios were determined by summation and averaging of a number of 
spectra of an eluting peptide pair followed by determination of the ratios of the peak 
intensities for the tag fragments. 
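
The ratio determination described above can be sketched in a few lines. This is an illustrative sketch only: the function name, the representation of each spectrum as a pair of tag fragment intensities, and the example intensities are assumptions made for illustration and are not measured data. Because averaging divides both summed intensities by the same number of spectra, the ratio is the same whether sums or averages are compared.

```python
from typing import Iterable, Tuple


def tag_fragment_fractions(spectra: Iterable[Tuple[float, float]]) -> Tuple[float, float]:
    """Sum the two tag fragment intensities over several spectra and
    return their fractional abundances (first tag, second tag)."""
    first_total = 0.0
    second_total = 0.0
    for first, second in spectra:
        first_total += first
        second_total += second
    total = first_total + second_total
    if total <= 0:
        raise ValueError("no tag fragment signal in the supplied spectra")
    return first_total / total, second_total / total


# Hypothetical intensities for the two tag fragments in three
# consecutive spectra of one eluting peptide pair.
example_spectra = [(60.0, 40.0), (120.0, 80.0), (30.0, 20.0)]
print(tag_fragment_fractions(example_spectra))
```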

Example 3a - Comparison of 1st and 2nd generation TMT tags 

To demonstrate the advantages of a tag designed with a fragmentation enhancing group, 
two different TMT designs were explored. The tags differ by the inclusion of proline in 
the 2nd generation tags (Figures 18a and 18b). Proline is known to enhance cleavage of the 
amide bond on its N-terminal side ([4]). 

Initial experiments on the fragmentation of the 1st generation of TMT in a Micromass 
QTOF 2 instrument showed that the intensity of the TMT fragments was very dependent 
on the amino acid sequence of the peptide and that at low collision energies the tag 
fragments did not accurately reflect the abundances of the tagged peptides. As shown in 
Figure 18c, the expected tag fragments have an m/z of 287 or 290 but, in the first 
generation tags, a second pair of ions with mass-to-charge ratios of 270 or 273 is 
observed. These fragments are thought to result from the loss of ammonia from the 
expected tag fragments. An example of a typical CID spectrum for a peptide labelled with 
the first generation tags is shown in Figure 19. At lower collision energies the intensities 
of these two fragment classes varied with the sequence of the attached peptide, but at 
higher CID energies the 270/273 fragments are observed almost exclusively. At these 
higher collision energies, the 270/273 tag fragments did accurately reflect the abundances 
of the peptide pairs. Additional experiments using a Finnigan LCQ ion trap mass 
spectrometer have shown the same fragmentation pattern as the QTOF for the first 
generation TMT units. The observed ammonia loss occurs in both LCQ and QTOF 
experiments. These instruments differ in the manner in which CID is carried out (selective 
activation and fragmentation of only the parent ion in the LCQ versus serial fragmentation 
of all ions in the QTOF). Since the loss of NH3 takes place in both instruments, this 
suggests that the loss of NH3 may take place directly from the parent peptide ion, rather 
than as a result of subsequent collisions of the expected fragment ion, and is an intrinsic 
feature of this tag structure. In both instruments the appearance of the 270/273 fragment 
is favoured by higher collision energies. This meant that, to get consistent behaviour from 
this tag, analysis had to take place at high collision energies. 
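
The tag fragment mass-to-charge ratios quoted above can be checked with a short mass calculation. The following is a sketch under a stated assumption: the tag fragment is taken to be the singly charged b-type cation of guanidinocaproyl-methionine, with elemental composition C12H23N4O2S, and the deuterated tag is taken to carry three deuterium atoms in place of three hydrogen atoms. This composition is inferred here from the names of the tag components and is not stated explicitly in the text. Atomic masses are standard monoisotopic values.

```python
# Monoisotopic atomic masses in Da (standard values).
ATOM_MASS = {
    "C": 12.0,
    "H": 1.00782503,
    "D": 2.01410178,
    "N": 14.0030740,
    "O": 15.9949146,
    "S": 31.9720707,
}
ELECTRON_MASS = 0.00054858


def cation_mz(formula: dict, charge: int = 1) -> float:
    """m/z of a cation whose atoms are given by `formula` (element -> count)."""
    neutral = sum(ATOM_MASS[element] * count for element, count in formula.items())
    return (neutral - charge * ELECTRON_MASS) / charge


def lose_ammonia(formula: dict) -> dict:
    """Return the composition after loss of NH3 (assumed to remove H, not D)."""
    product = dict(formula)
    product["N"] -= 1
    product["H"] -= 3
    return product


# ASSUMPTION: composition of the guanidinocaproyl-Met tag fragment cation.
D0_TAG = {"C": 12, "H": 23, "N": 4, "O": 2, "S": 1}
D3_TAG = {"C": 12, "H": 20, "D": 3, "N": 4, "O": 2, "S": 1}

for name, formula in (("d0", D0_TAG), ("d3", D3_TAG)):
    intact = cation_mz(formula)
    after_loss = cation_mz(lose_ammonia(formula))
    print(f"{name}: tag fragment {intact:.2f}, after NH3 loss {after_loss:.2f}")
```

Under these assumptions the calculated values round to 287 and 290 for the intact tag fragments and to 270 and 273 after loss of ammonia, consistent with the ions described above.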

Although CID is more selective in the LCQ, it is unfortunately limited in its use with 
TMTs as it is not possible to detect small CID fragmentation products of larger precursors 
with this type of instrument. In the QTOF instrument, however, at the higher energies of 
collision, consecutive fragmentations were problematic. In the QTOF, the series of b- or 
y-ion fragments that provide sequence information are further fragmented to give smaller 
species so that no sequence information could be obtained from the peptide. As a result 
of the need for high energy CID to guarantee the release of the tag fragments and to 
obtain accurate quantification, the first generation TMT units can only be reliably used for 
the purposes of quantification without peptide identification in the QTOF. This will also 
be true of other serial MS/MS instruments. 

Figures 19a and 19b show typical CID spectra for a peptide labelled with the first 
generation TMT at collision energies of 40V (Figure 19a) and 70V (Figure 19b). In 19a 
weak peaks in both of the 270/273 and 287/290 regions can be seen at 40V, but they do 
not accurately represent the abundances of the tagged peptides. Some sequence specific 
y-series ions can, however, be observed at this accelerating potential. In 19b the peaks 
corresponding to the tag fragment can be seen clearly at m/z 270 and 273 for the first 
generation TMT at a collision energy of 70V. At this collision energy the intensities of 
these peaks accurately represent the relative abundances of each peptide (see inset for 
zoom of the tag region in Figure 19b) but no sequence data can be determined. 

These results led to the development of a 2nd generation TMT, which has a proline 
residue in the TMT unit to enhance the fragmentation. To quantify the effect of the 
proline in the second generation tags, a 50:50 mixture of a peptide labelled with the first 
and second generation tags respectively was analysed by MS/MS. The two resultant 
peptides, with the sequences Guanidinocaproyl-Met(D3)-Met-GLGEHNIDVLEGNEQFINAAK 
and Guanidinocaproyl-Met(D3)-Pro-Met-GLGEHNIDVLEGNEQFINAAK, had ions 
corresponding to the [M+3H]3+ species at mass-to-charge ratios of approximately 897 and 
929 for the first and second generation tags respectively. To get the same collision 
conditions for both precursors, the peptides were first mixed and then analysed in a QTOF 
instrument with the quadrupole set to alternately select ions with m/z around 897 or 929. 
Each selected ion was subjected to CID at increasing collision energies. 

At collision energies of 20V or less no fragmentation at all was observed for either type of 
TMT. At a collision energy of 30V-35V it is possible to see the expected TMT fragment 
ion at m/z of 290 in the CID spectrum for the peptide with the second generation tag, but 
no fragment ion at m/z of 273 can be seen in the spectrum for the peptide with the first 
generation tag at the same energy (see Figure 20), although a weak fragment at m/z of 290 
can be seen. The tag fragment for the peptide containing the first generation TMT is not 
observed until a collision energy of 70V is used (data not shown). Smaller peptides 
labelled with the first generation TMT gave rise to the tag fragment at lower energies, but 
high collision energies were required to release the tag fragment from larger peptides. 
The dependence of the energy needed to release the tag fragment on the size of the peptide 
was much smaller for the second generation TMT. Comparison of the CID spectra from 
peptides labelled with TMTs containing proline with peptides labelled with TMTs 
without proline shows clearly that the introduction of the proline amino acid as a 
fragmentation enhancer leads to fragmentation in favour of the expected TMT tag 
fragment without resorting to very high collision energies. At these lower energies the 
abundance ratios of the TMT fragment ions from the proline containing TMTs also 
accurately reflect the ratios of the concentrations of the tagged peptides. In addition, the 
identification of the peptide via its b and y series can also be performed at these lower 
collision energies. 

Figures 20a, 20b and 20c show MS and MS/MS spectra for triply charged ions of 
peptide 2 (see Table 7) labelled with the first and second generation TMTs. The peptides 
were analysed in a QTOF II instrument. Figure 20a shows the MS-mode TOF spectrum 
of the peptide mixture. For CID analysis the first quadrupole was set to alternately select 
ions with m/z around 897 or 929. The CID spectrum at 35V for 
Guanidinocaproyl-Met(D3)-Met-GLGEHNIDVLEGNEQFINAAK is shown in Figure 20b and the 
CID spectrum at 35V of Guanidinocaproyl-Met(D3)-Pro-Met-GLGEHNIDVLEGNEQFINAAK 
is shown in Figure 20c. The expected tag fragment at m/z of 273 is not detected for the 
first generation TMT in Figure 20b, but the expected fragment at 290 is clearly observed 
at 35V for the second generation TMT in Figure 20c. 

The improved behaviour of the second generation TMT can be seen in Figure 21 which 
shows a typical CID spectrum of a peptide labelled with these tags. The tag fragments 
revealing the abundance ratios are easily seen at the expected m/z values of 287 and 290. 
In addition it is possible to see both b-series and y-series ions allowing the sequence of 
the peptide to be determined. CID was performed at a relatively low collision energy of 
40V. The peaks at m/z 287 and 290 for the second generation TMT at 40V represent the 
relative abundances of each peptide (see inset with zoom of the relevant region of the 
mass spectrum). 

Figure 22 clearly shows that the charge state of the TMT tagged peptide does not affect 
the appearance of the tag fragments in the CID spectra of the labelled peptides. In this 
example a peptide labelled with a first generation TMT is shown but the same result is 
found for the second generation tags. This is advantageous as it means that scanning of 
the spectrum can take place without complex adjustments of the scanning software to 
compensate for the charge state of each peptide. In other isotope tagging procedures, 
such as ICAT, the charge state alters the mass difference between each tagged ion pair, 
such that for doubly charged ions the mass difference is halved, for triply charged ions the 
mass difference is a third of that for the singly charged ions, etc. Software to scan for 
peptide pairs using conventional isotope labelling techniques, like ICAT, must therefore 
compensate for these sorts of problems by allowing for the different possible mass 
differences or by ignoring certain classes of ion, which either increases the chance of 
erroneous identification of peptide pairs or misses out on potential ion pairs that could 
offer useful information. 
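
The charge state dependence described above follows from simple arithmetic: for a pair of intact labelled peptides whose masses differ by a fixed amount, the spacing between the two peaks on the m/z axis is that mass difference divided by the charge, whereas the TMT tag fragments appear at fixed m/z values of 287 and 290 whatever the charge of the precursor. The following is a minimal sketch; the 8 Da mass difference is used purely as an illustrative value and the function name is hypothetical.

```python
def pair_spacing_mz(mass_difference: float, charge: int) -> float:
    """Spacing on the m/z axis between two ions of equal charge whose
    masses differ by `mass_difference`."""
    if charge < 1:
        raise ValueError("charge must be a positive integer")
    return mass_difference / charge


ILLUSTRATIVE_MASS_DIFFERENCE = 8.0  # Da, illustrative value only
TAG_FRAGMENT_MZ = (287, 290)        # tag fragment m/z values from the text

for z in (1, 2, 3, 4):
    spacing = pair_spacing_mz(ILLUSTRATIVE_MASS_DIFFERENCE, z)
    print(f"precursor charge {z}+: intact pair spacing {spacing:.2f}, "
          f"tag fragments at m/z {TAG_FRAGMENT_MZ[0]} and {TAG_FRAGMENT_MZ[1]}")
```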

Figure 22 shows a comparison of spectra for peptide 4 from Table 7 where CID has been 
performed on the [M+4H]4+ (bottom spectrum) and [M+5H]5+ (top spectrum) species. 
The peptide contains the first generation TMT. The 4+ ion has an m/z of 969.3 
while the 5+ ion has an m/z of 775.6. The tag fragment ion appears at the expected 
mass-to-charge ratio of 273 in both spectra, indicating that only one charge localises to 
the tag fragment. 

Figure 23 shows data for expected and observed ratios of peptides from ESI-MS/MS 
analyses of the 4 peptides listed in Table 7. Peptides with both first and second 
generation TMTs incorporated into them were analysed. Abundance ratios were 
determined by analysing the peak maxima at the d3 (A) and d0 (B) of the tag fragment ion 
peaks after peak normalisation at 290 and 287 for TMT2. Measurements were made in a 
QTOF instrument. The table inset to Figure 23 shows expected and observed ratios of the 
b-ion fragments from the MS/MS analysis of eluting TMT labelled peptides. It can be 
seen that both generations of TMT provide accurate representation of abundance ratios of 
the peptides in the mixtures and that the tags show linear behaviour over the entire range 
of peptide ratios tested. 

Example 3b - Demonstration of identical chromatographic behaviour of TMT tags in 
LC-MS 

A mixture of four pairs of synthetic peptides was synthesised with the second generation 
TMT units pre-incorporated at the N-terminus of each peptide. The peptide pairs were all 
analysed together. Each peptide pair was prepared at a different ratio. The sequences, 
theoretical mono-isotopic masses and the doubly charged ion masses are shown in Table 7. 
The peptides were loaded onto a C-18 reverse phase HPLC column and separated. The 
purpose of this experiment was to demonstrate the exact co-elution of corresponding pairs 
of peptides with different TMT tags without any other complications. The ratios of the 
peptide pairs were expected and found to be consistent over the entire elution time for 
each peptide pair, and so a further object of this experiment was to show that 
quantification of the peptide pairs could be performed with simultaneous sequence 
determination and that it would be possible to scan for other peptides without waiting for 
the complete elution of the peptide. Complete elution of peptide pairs is necessary for 
accurate quantification using the ICAT strategy and other peptide analysis techniques 
using conventional isotope labelling. This greatly restricts the throughput of these 
approaches. 

Figure 24 shows the co-elution of each peptide pair, peptides A and B for each peptide 
from Table 7, clearly seen in the C18-reverse phase HPLC traces. For each peptide the 
ion currents at m/z 287 and 290 are recorded, corresponding to the tag fragments from 
each of the TMTs. The bottom trace for each peptide is the total ion current. The elution 
profiles of 3 peptides monitored at each of the mass-to-charge ratios of the b2 ions from 
the tag fragments are shown. It can be clearly seen that the peptide pairs elute as a single 
fraction. In MS/MS mode, monitoring of the tag fragment ions produces virtually 
identical results in each case. For each peptide pair the observed ratios matched the 
expected ratios to a reasonable degree. 

Since the tagged peptides exactly co-elute, the ratios of the peptide pairs are conserved 
throughout the elution profile, which means that it is not necessary to integrate the total 
ion current for the eluting ions to determine the relative abundance of each peptide pair. 
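
The consequence of exact co-elution can be illustrated numerically. In the sketch below, two tag fragment ion current traces are modelled as Gaussian elution profiles with identical position and width and with areas in a 40:60 ratio. The traces are synthetic and serve only to illustrate the argument; the function names and profile parameters are assumptions. When the profiles coincide, the ratio calculated from any single scan equals the ratio calculated from the integrated traces.

```python
import math
from typing import List


def gaussian_profile(n_scans: int, centre: float, width: float, scale: float) -> List[float]:
    """Synthetic elution profile sampled at integer scan numbers."""
    return [scale * math.exp(-0.5 * ((i - centre) / width) ** 2) for i in range(n_scans)]


def per_scan_fractions(trace_a: List[float], trace_b: List[float]) -> List[float]:
    """Fraction contributed by trace_a in each scan that contains signal."""
    return [a / (a + b) for a, b in zip(trace_a, trace_b) if (a + b) > 0]


# Synthetic co-eluting traces for a peptide pair mixed 40:60.
trace_a = gaussian_profile(n_scans=21, centre=10, width=3, scale=40.0)
trace_b = gaussian_profile(n_scans=21, centre=10, width=3, scale=60.0)

fractions = per_scan_fractions(trace_a, trace_b)
integrated = sum(trace_a) / (sum(trace_a) + sum(trace_b))

print(f"per-scan fraction range: {min(fractions):.3f} to {max(fractions):.3f}")
print(f"fraction from integrated traces: {integrated:.3f}")
```

If the two profiles were shifted relative to each other, as can happen when labelled peptides do not co-elute exactly, the per-scan fractions would vary across the peak and only the integrated value would be reliable.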

Example 3c - Analysis of the sensitivity and robustness of the TMT technology 

To provide an effective improvement over conventional isotope labelling, the TMT 
technology must be at least as sensitive as other isotope labelling methods and must have 
a broadly similar dynamic range. In addition, the properties of the tags must be consistent 
over the whole expected dynamic range of the samples to be analysed. Finally, the ability 
of these tags to overcome noise in the mass spectrometer needed to be demonstrated. To 
test the dynamic range of the system and to show that the properties of the TMT tags are 
consistent over the entire dynamic range, the conservation of peptide ratios was examined 
at a range of different concentrations of one of the tagged synthetic peptide pairs (peptides 
3A and 3B). As can be seen from Figure 25, in a serial dilution of peptides 3A and 3B, 
mixed in a ratio of 40:60, from 100 pmoles to 100 fmoles, the ratios were reliably 
conserved, deviating from the expected ratio by less than 5% in most cases. These and 
other results (not shown) indicate that the tag peptides do not reduce the intrinsic 
sensitivity with which a peptide is detected in the MS/MS mode, i.e. the analysis of TMT 
labelled peptides by CID has essentially the same sensitivity as the MS/MS of untagged 
peptides. The intrinsic sensitivity seems to be instrument specific based on comparisons 
between the LCQ and QTOF in the analysis of small peptides (the tag fragments from large 
peptides labelled with TMTs cannot be detected on the LCQ because of the intrinsic 
limitations on CID with this type of instrument). The sensitivity with which it is possible 
to determine the sequence of tagged peptides does not seem to have been significantly 
changed in any of the peptides tested so far. Meaningful differences in the ratios of the 
peptides can be detected over the entire range of concentrations tested (Figure 25). 

Figures 26a, 26b and 26c show the results of a spiking experiment in which the peptide 
pair 3A and 3B (500 fmol in total, in a ratio of 40:60 respectively) bearing a second 
generation TMT was mixed with a tryptic digest of Bovine Serum Albumin (2 pmol). 
Figure 26a shows the base peak chromatogram from analysis in the MS-mode. During the 
run, the five most intense ions analysed in MS mode were automatically fragmented in the 
MS/MS mode at 30V. The TMT peptide pair was investigated and located on the base 
peak chromatogram. The ratio of the TMT2 fragments was then calculated from the 
MS/MS spectrum for the [M+3H]3+ ion (a zoom of the tag fragments is shown in Figure 
26b and the whole spectrum is shown in Figure 26c) by comparing the intensity of the d0 
and d3 TMT fragment mass-to-charge ratios (287 and 290). 

In a further experiment, the ability to detect labelled peptides in a background of 
contaminating peptides was examined. The peptide pair 3A and 3B bearing a second 
generation TMT was mixed with a 20-fold excess of a tryptic digest of Bovine Serum 
Albumin. The peptide mixture was then analysed in an LC-QTOF instrument. The five 
most intense ions from each elution scan were subjected to CID to identify the peptides. 
The expected peptides were detected and the region of spectrum corresponding to the tag 
fragments was analysed to determine the abundance ratio of the detected peptides. 
Analysis by CID (collision energy of 30V) provides the spectrum shown in Figure 26c. 
The ratio of the peptides 3A and 3B was found to be 39.3% to 60.7% respectively, by 
comparison of the peak intensities at the fragment ion mass-to-charge ratios of 290 (d3 
TMT unit) and 287 (d0 TMT unit). The expected ratio was 40% 3A to 60% 3B, thus the 
peptide ratio was detected with a 1.7% error. The quality of the MS/MS spectrum 
obtained (Figures 26b and 26c) at the low collision energy used allows a clear 
identification of the peptide sequence by database searching. This experiment clearly 
shows that a complex mixture of tryptic peptides does not hinder the analysis of peptide 
pairs labelled with the 2nd generation TMT tags and that the TMTs can help to overcome 
noise in the sample. In addition there do not seem to be any suppression problems: ratios 
of peptides present in low concentrations can still be determined in the presence of other 
peptides that are in high concentrations. 
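
The error figure quoted above can be reproduced from the reported percentages. The following is a minimal sketch, assuming that the error is expressed relative to the expected share of peptide 3A; the function name is illustrative.

```python
def relative_error_percent(expected: float, observed: float) -> float:
    """Deviation of the observed value from the expected value,
    expressed as a percentage of the expected value."""
    if expected == 0:
        raise ValueError("expected value must be non-zero")
    return abs(observed - expected) / expected * 100.0


EXPECTED_SHARE_3A = 40.0  # percent, from the text
OBSERVED_SHARE_3A = 39.3  # percent, from the text

error = relative_error_percent(EXPECTED_SHARE_3A, OBSERVED_SHARE_3A)
print(f"relative error: {error:.2f}%")
```

On this basis the deviation is about 1.75%, in line with the figure of 1.7% reported above.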

CLAIMS: 

1. A set of two or more mass labels, each label in the set comprising a mass marker 
moiety attached via a cleavable linker having at least one amide bond to a mass 
normalisation moiety, wherein the aggregate mass of each label in the set may be the 
same or different and the mass of the mass marker moiety of each label in the set may be 
the same or different, and wherein in any group of labels within the set having a mass 
marker moiety of a common mass each label has an aggregate mass different from all 
other labels in that group, and wherein in any group of labels within the set having a 
common aggregate mass each label has a mass marker moiety having a mass different 
from that of all other mass marker moieties in that group, such that all of the mass labels 
in the set are distinguishable from each other by mass spectrometry, and wherein the mass 
marker moiety comprises an amino acid and the mass normalisation moiety comprises an 
amino acid. 

2. A set of mass labels according to claim 1, in which each label in the set comprises 
a mass marker moiety having a common mass and each label in the set has a unique 
aggregate mass. 

3. A set of mass labels according to any preceding claim, in which each label in the 
set comprises a mass marker moiety having a unique mass and each label in the set has a 
common aggregate mass. 

4. A set of mass labels according to claim 3, in which each mass marker moiety in 
the set has a common basic structure, and each mass normalisation moiety in the set has a 
common basic structure that may be the same or different from the common basic 
structure of the mass marker moieties, and wherein each mass label in the set comprises 
one or more mass adjuster moieties, the mass adjuster moieties being attached to or 
situated within the basic structure of the mass marker moiety and/or the basic structure of 
the mass normalisation moiety, such that every mass marker moiety in the set comprises a 
different number of mass adjuster moieties and every mass label in the set has the same 
number of mass adjuster moieties. 

5. A set of mass labels according to claim 4, each mass label in the set having the 
following structure: 

M(A)y-L-X(A)z 

wherein M is a mass normalisation moiety comprising an amino acid, X is a mass marker 
moiety comprising an amino acid, A is a mass adjuster moiety, L is a cleavable linker 
comprising the amide bond, y and z are integers of 0 or greater, and y+z is an integer of 1 
or greater. 

6. A set of mass labels according to claim 4 or claim 5, wherein the mass adjuster 
moiety is selected from: 

(a) an isotopic substituent situated within the basic structure of the mass 
marker moiety and/or within the basic structure of the mass normalisation moiety, and 

(b) substituent atoms or groups attached to the basic structure of the mass 
marker moiety and/or attached to the basic structure of the mass normalisation moiety. 

7. A set of mass labels according to claim 6, wherein the mass adjuster moiety is 
selected from a halogen atom substituent, a methyl group substituent, and or 13C 
isotopic substituents. 

8. A set of mass labels according to claim 7, wherein the mass adjuster moiety is a 
fluorine atom substituent. 

9. A set of mass labels according to any preceding claim, wherein the cleavable 
linker attaching the mass marker moiety to the mass normalisation moiety is a linker 
cleavable by collision induced dissociation. 

10. A set of mass labels according to any preceding claim, wherein the cleavable 
linker comprises proline and/or aspartic acid. 

11. A set of mass labels according to any preceding claim, wherein the mass marker 
moiety and/or the mass normalisation moiety comprises a fragmentation resistant group. 

12. A set of mass labels according to any preceding claim, wherein the mass marker 
moiety comprises a sensitivity enhancing group, such as a pre-ionised group. 

13. A set of mass labels according to any preceding claim, wherein the mass marker 
moiety or the mass normalisation moiety comprises a reactive functionality. 

14. A set of mass labels according to any preceding claim, wherein each mass label in 
the set comprises an affinity capture ligand, such as biotin. 

15. A set of two or more analytes, each analyte in the set being different and being 
attached to a unique mass label or a unique combination of mass labels, from a set of mass 
labels as defined in any of claims 1-14. 

16. A set of analytes according to claim 15, wherein one or more analytes in the set is 
a standard analyte having a known mass, or known chromatographic properties. 

17. A set of two or more probes, each probe in the set being different and being 
attached to a unique mass label or a unique combination of mass labels, from a set of mass 
labels as defined in any of claims 1-14. 

18. A set of analytes or probes according to any of claims 15-17, wherein each analyte 
or probe is attached to a unique combination of mass labels, each combination being 
distinguished by the presence and absence of each mass label in the set of mass labels 
and/or the quantity of each mass label attached to the probe. 

19. A set of analytes or probes according to any of claims 15-18, wherein each analyte 
or probe comprises a biomolecule. 

20. A set of analytes or probes according to claim 19, wherein the biomolecule is 
selected from a DNA, an RNA, an oligonucleotide, a nucleic acid base, a protein and/or 
an amino acid. 

21. A method of analysis, which method comprises detecting an analyte by identifying 
by mass spectrometry a mass label or a combination of mass labels relatable to the 
analyte, wherein the mass label is a mass label from a set of mass labels as defined in any 
of claims 1-14. 

22. A method according to claim 21, wherein the mass labels employed are labels 
comprising an affinity capture ligand, and labelled analytes are separated from unlabelled 
analytes by capturing the affinity capture ligand with a counter ligand. 

23. A method according to claim 21 or claim 22, in which two or more analytes are 
detected by simultaneously identifying their mass labels or combinations of mass labels 
by mass spectrometry. 

24. A method according to any of claims 21-23, wherein each analyte is identified by a 
unique combination of mass labels from a set or array of mass labels, each combination 
being distinguished by the presence and absence of each mass label in the set or array 
and/or the quantity of each mass label. 

25. A method according to any of claims 21-24 for identifying two or more analytes, 
wherein the analytes are separated according to their mass, prior to detecting their mass 
labels by mass spectrometry. 

26. A method according to claim 25, wherein the analytes to be identified are mixed 
with one or more standard analytes, having known mass or known properties in the 
separation method used, to facilitate the characterisation of the analytes. 

27. A method according to claim 26, wherein the standard analytes are as defined in 
claim 16. 

28. A method according to any of claims 25-27, wherein separation is carried out by a 
chromatographic or electrophoretic method. 

29. A method according to any of claims 21-28, wherein the mass spectrometer 
employed to detect the mass label comprises one or more mass analysers, which mass 
analysers are capable of allowing ions of a particular mass, or range of masses, to pass 
through for detection and/or are capable of causing ions to dissociate. 

30. A method according to claim 29, wherein ions of a particular mass or range of 
masses specific to one or more known mass labels are selected using the mass analyser, 
the selected ions are dissociated, and the dissociation products are detected to identify ion 
patterns indicative of the selected mass labels. 

31. A method according to claim 29 or claim 30, wherein the mass spectrometer 
comprises three quadrupole mass analysers. 

32. A method according to claim 31, wherein a first mass analyser is used to select 
ions of a particular mass or mass range, a second mass analyser is used to dissociate the 
selected ions, and a third mass analyser is used to detect resulting ions. 

33. A method according to any of claims 21-32, which method comprises: 

(a) contacting one or more analytes with a set of probes, wherein the probes 
are as defined in any of claims 17-20, 

(b) identifying an analyte, by detecting a probe relatable to that analyte. 

34. A method according to claim 33, wherein the mass label is cleaved from the probe 
prior to detecting the mass label by mass spectrometry. 

35. A method according to claim 33 or claim 34, which method comprises contacting 
one or more nucleic acids with a set of hybridisation probes. 

36. Use of a mass label from a set of labels as defined in any of claims 1-14, in a 
method of analysis. 

37. Use according to claim 36 in a method of 2-dimensional electrophoretic analysis. 

38. Use according to claim 36 in a method of 2-dimensional mass spectrometric 
analysis. 

39. Use according to any of claims 36-38 in a method of sequencing one or more 
nucleic acids. 

40. Use according to any of claims 36-38 in a method of gene expression profiling. 

41. Use according to any of claims 36-38 in a method of protein expression profiling. 

42. Use according to any of claims 36-38 in a method of nucleic acid sorting. 